# Week 3

## Vision and Visuals

Sources include: James Helman's Siggraph '93 Applied VR course notes, Dennis Proffitt's '94 Developing Advanced VR Applications course notes, Lou Harrison's Siggraph '97 Stereo Computer Graphics for Virtual Reality notes, and the IRIS Performer guide.

The eye has a bandwidth of approximately 1 gigabyte per second - much greater than that of the other senses.

Temporal Resolution

The real world doesn't flicker. CRTs do flicker because the image is constantly being refreshed. If the image isn't refreshed fast enough, we perceive flickering. Some people can perceive flickering even at 60Hz (the image being refreshed 60 times per second) for a bright display with a large field of view but most people stop perceiving the flicker between 15Hz (for dark images) and 50Hz (for bright images).

Luminance

The eye has a dynamic range of 7 orders of magnitude

The eye is sensitive to ratios of intensities, not absolute magnitudes. Brightness = Luminance^0.33, so to make something appear n times brighter the luminance must be increased by a factor of roughly n^3.
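As a quick worked example of that power law, here is a minimal sketch. It rounds the exponent to exactly 1/3 so that the n^3 rule comes out cleanly; the function names are made up for illustration.

```python
# Sketch of the brightness power law from the notes.
# Assumption: the exponent 0.33 is treated as exactly 1/3.
EXPONENT = 1.0 / 3.0


def perceived_brightness(luminance: float) -> float:
    """Relative perceived brightness for a given relative luminance."""
    return luminance ** EXPONENT


def luminance_factor_for(brightness_factor: float) -> float:
    """How much luminance must be multiplied by to look
    `brightness_factor` times brighter (n -> n^3)."""
    return brightness_factor ** (1.0 / EXPONENT)


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(f"{n}x brighter needs about {luminance_factor_for(n):.0f}x the luminance")
```

So doubling the apparent brightness takes about 8 times the luminance, and a tenfold increase takes about 1000 times.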

Colour

Most perceptual processes are driven by intensity, not colour. The motion system is colour blind, depth perception is colour blind, and object recognition is colour blind.

However, uniquely coloured objects are easy to find.

Field of View

Each eye has approximately 150 degrees horizontal (60 degrees towards the nose and 90 degrees to the side) and 120 degrees vertically (50 degrees up and 80 degrees down)

Visual Acuity

Below is an 'eye chart' showing the resolutions of various VR devices. From left to right: 20/20, CRT 20/40, HMD 20/425, BOOM 20/85, and CAVE 20/110, using the Snellen fraction (20/X, where this viewer sees at 20 feet the detail that the average person can see at X feet; 20/200 is legally blind).
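One rough way to estimate a Snellen fraction for a display is sketched below. It assumes that 20/20 vision corresponds to resolving detail of about 1 arcminute and that one pixel is the smallest detail the display can show. This is an assumed rule of thumb, not necessarily how the figures in the chart were produced, and the example resolution and field of view are made up.

```python
def snellen_denominator(h_pixels: int, h_fov_degrees: float) -> float:
    """Approximate X in 20/X for a display.

    Assumptions: 20/20 vision resolves about 1 arcminute, and the pixels
    are spread evenly across the horizontal field of view.
    """
    if h_pixels <= 0 or h_fov_degrees <= 0:
        raise ValueError("pixel count and field of view must be positive")
    arcmin_per_pixel = h_fov_degrees * 60.0 / h_pixels
    return 20.0 * arcmin_per_pixel


if __name__ == "__main__":
    # Hypothetical display: 640 pixels stretched across 100 degrees.
    print(f"20/{snellen_denominator(640, 100.0):.0f}")
```

The same pixel count spread over a wider field of view gives a worse fraction, which is one reason wide-field displays tend to look coarse.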

How the Eye Works

You probably heard a lot about this in CS 488. Here are my course notes from CS 488 on the topic. Here is a nice page on colour blindness: http://www.toledo-bend.com/colorblind/Ishihara.html

Real World Depth Cues

1. Occlusion (hidden surfaces)
2. Perspective Projection (parallel lines meet at infinity)
3. Binocular Disparity
4. Motion Parallax (due to head motion)
5. Convergence (rotation of the eyes to view a close object)
6. Accommodation (change of shape of eye to view a close object - focus)
7. Atmospheric (fog)

Computer graphics give 1,2,7,8
VR gives 3,4,5
nothing for 6 yet.

In VR the brain is getting conflicting cues about the virtual world. Some of these cues indicate that the world is 3D (convergence and stereopsis). Others indicate that the world is flat (accommodation).

The eyes are focusing on the flat screen but they are converging depending on the position of the virtual objects

How to Generate Stereo Images

We have stereo vision because each eye sees a slightly different image. Well, almost all of us (90-95%) do.

Presenting stereo imagery through technology is over 150 years old. Here is a nice history of the stereopticon http://www.bitwise.net/~ken-bill/stereo.htm

As a kid you might have read some 3D comics which came with red/blue (anaglyphic) glasses, or you might have even seen a movie with the red/blue glasses (maybe down at U of Chicago's Documentary Film Group with films like 'House of Wax' or 'Creature from the Black Lagoon' or 'It Came from Outer Space').

The images look like the one below (and if you have a pair of red/blue glasses and a correctly calibrated display then this image will become 3D.) The coloured lenses make one image more visible to one of your eyes and less visible to your other eye.

And here are a couple more pictures from an ocean-going core drilling ship.

As a kid you may have had a View-Master(tm) and seen the cool 3D pictures. Those worked exactly the same way: your left eye was shown one image on the disc, while your right eye was shown a different image.

One inexpensive way to do this is to draw two slightly different images onto the screen, place them next to each other and tell the person to fuse the stereo pair into a single image. This is easy for some people, very hard for other people, and impossible for a few people.

Some of these images require your left eye to look at the left image, others require your left eye to look at the right image.

To see the pictures below as a single stereo image look at the left image with your right eye and the right image with your left eye. If you aren't used to doing this then try this: Hold a finger up in front of your eyes between the two images on the screen and look at the finger. You should notice that the two images on the screen are now 4. As you move the finger towards or away from your head the two innermost images will move towards or away from each other. When you get them to merge together (that is you only see 3 images on the screen) then all you have to do is re-focus your eyes on the screen rather than on your finger. Remove your finger, and you should be seeing a 3D image in the middle.

It can be a bit of a strain to get 3D images this way, so animated 3D computer graphics are not done this way.

You either have to isolate a person's eyes and feed a different video signal to each eye (like an animated View-Master) or have a single screen, alternate the left eye and right eye images, and cover up whichever eye should not be seeing the image (a high-tech version of the red/blue glasses.)

A very important thing to keep in mind is what kind of display you will be using, and the space that it will be used in. This is especially true of projection-based systems. Will the room be dark enough for you to get the colours / brightness / contrast that you expect?

If the system is front-projection then how close will the user be able to get to the display?

Flat panel displays have a different problem - they are designed for on-axis viewing, and the further off-axis you are, the greater the chance that you will see degraded colour / contrast / stereo vision.

Some Terminology

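Horizontal Parallax - when the retinal images of an object fall on disparate points on the two retinas, these points differ only in their horizontal position. Its value is given by R - L, the horizontal position of the point in the right eye view minus its position in the left eye view.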

Stereo Window (Plane) - the plane at which there is no parallax between the two eye views - usually at the same depth as the monitor screen or the projection surface

Homologous Points - points which correspond to each other in the separate eye views

Interocular Distance - the distance between the viewer's left and right eyes, usually about 2.5 inches

Positive Parallax - the point lies behind the stereo window

Zero Parallax - the point is at the same depth as the stereo window

Negative Parallax - the point lies in front of the stereo window

Vertical Displacement - vertical parallax between homologous points relative to the line that the two eyes form

Interocular Crosstalk (Ghosting) - when one eye can see part of the other eye's view as well
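To make the sign convention for parallax concrete, here is a small sketch using similar triangles. It assumes the point lies straight ahead of the viewer, midway between the eyes, and that all distances are measured from the eyes in the same units (inches here, to match the 2.5 inch interocular distance). The function names are made up for illustration.

```python
def screen_parallax(point_dist: float, screen_dist: float,
                    interocular: float = 2.5) -> float:
    """Horizontal parallax R - L on the screen, in the same units as the inputs.

    Assumption: the point is straight ahead of the viewer, so by similar
    triangles R - L = interocular * (point_dist - screen_dist) / point_dist.
    """
    if point_dist <= 0 or screen_dist <= 0:
        raise ValueError("distances must be positive")
    return interocular * (point_dist - screen_dist) / point_dist


def parallax_type(parallax: float, eps: float = 1e-9) -> str:
    """Name the case: positive (behind), zero (on), negative (in front)."""
    if parallax > eps:
        return "positive"
    if parallax < -eps:
        return "negative"
    return "zero"


if __name__ == "__main__":
    screen = 24.0  # viewer sits 24 inches from the screen
    for depth in (12.0, 24.0, 48.0, 1e6):
        p = screen_parallax(depth, screen)
        print(f"point at {depth:>9.0f} in: parallax {p:+.2f} in ({parallax_type(p)})")
```

Note that for very distant points the parallax approaches the interocular distance but never exceeds it, while for points close to the viewer the negative parallax grows without bound.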

Updating Visuals based on Head Tracking

Naively we would like to update the graphics every frame in order to use the most recent head positions.

Since there will be jitter in the tracker values and latencies to deal with, this may result in the image jittering.

One way to avoid this is to only update the image when the head has moved (or rotated) a certain amount so the software knows that the motion was intentional.
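A minimal sketch of that thresholding idea (a 'dead band') follows. To keep it short it tracks only a 3D position and a single yaw angle in degrees; a real tracker reports a full orientation. The class name and threshold values are hypothetical.

```python
import math


class DeadbandFilter:
    """Only accept a new head pose once it has moved or turned far enough.

    Assumption: orientation is reduced to a yaw angle in degrees.
    """

    def __init__(self, min_move: float, min_turn_deg: float):
        self.min_move = min_move
        self.min_turn_deg = min_turn_deg
        self._pos = None
        self._yaw = None

    def update(self, pos, yaw_deg):
        """Feed in the latest tracker sample; returns the pose to render with."""
        if self._pos is None:
            self._pos, self._yaw = tuple(pos), yaw_deg
            return self._pos, self._yaw
        moved = math.dist(pos, self._pos)
        # Smallest angle between the two headings, handling 359 -> 0 wraparound.
        turned = abs((yaw_deg - self._yaw + 180.0) % 360.0 - 180.0)
        if moved >= self.min_move or turned >= self.min_turn_deg:
            self._pos, self._yaw = tuple(pos), yaw_deg
        return self._pos, self._yaw
```

The trade-off is that small intentional motions are ignored too, and the image steps rather than glides when the threshold is finally crossed.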

Another option is to interpolate between the previous and current position and rotation values in order to smooth out this motion. This results in smoother transitions but will also increase the lag slightly.
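Here is a rough sketch of the interpolation approach, again assuming the orientation is simplified to a yaw angle in degrees (full 3D rotations are usually blended with quaternions instead). The blend factor `t` is a made-up tuning parameter: 1.0 follows the tracker exactly, while smaller values smooth more and lag more.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def smooth_pose(prev_pos, new_pos, prev_yaw, new_yaw, t=0.5):
    """Blend the previous pose towards the newest tracker sample.

    Assumption: yaw is in degrees and is interpolated along the shorter
    arc so that 350 -> 10 passes through 0 rather than through 180.
    """
    pos = tuple(lerp(p, n, t) for p, n in zip(prev_pos, new_pos))
    delta = (new_yaw - prev_yaw + 180.0) % 360.0 - 180.0
    yaw = (prev_yaw + delta * t) % 360.0
    return pos, yaw


if __name__ == "__main__":
    print(smooth_pose((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 350.0, 10.0))
```

Calling this every frame with the previous output as `prev_pos` and `prev_yaw` gives a simple exponential smoothing of the tracker data.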

Another option is to extrapolate into the future by predicting where the user's head is going to be a short time ahead, typically a fraction of a second.
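The simplest possible predictor assumes the head keeps moving at its current velocity. The sketch below estimates that velocity from the last two position samples; real systems often use more samples or a filter, and the function name and lookahead value here are hypothetical.

```python
def predict_position(prev_pos, prev_time, curr_pos, curr_time, lookahead):
    """Constant-velocity extrapolation of a tracked position.

    Assumption: the velocity between the last two samples stays the same
    for the next `lookahead` seconds.
    """
    dt = curr_time - prev_time
    if dt <= 0:
        # No usable velocity estimate; fall back to the latest sample.
        return tuple(curr_pos)
    return tuple(c + (c - p) / dt * lookahead
                 for p, c in zip(prev_pos, curr_pos))


if __name__ == "__main__":
    # Head moved 0.1 units in 0.1 s; predict 50 ms ahead.
    print(predict_position((0.0, 0.0, 0.0), 0.0, (0.1, 0.0, 0.0), 0.1, 0.05))
```

Prediction hides latency when the motion is steady, but it overshoots when the user stops or changes direction, and it amplifies any jitter in the samples.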

Off-axis Projection

In an HMD or the BOOM, the screens in front of each of the user's eyes move with the user - that is, the screens are always perpendicular to the eye's line of sight (assuming the eyes are looking straight ahead.)

This allows the traditional computer graphics 'camera paradigm' to be extended so that there are 2 cameras in use - one for each eye.

In the CAVE or Fish Tank VR, this is not the case. The projection planes stay at a fixed position as the user moves around. The user may not be looking perpendicular to the screen and certainly can't be looking perpendicular to all of the screens in the CAVE simultaneously - in this case 'off-axis projection' is used.
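As a sketch of what off-axis projection means in practice, the code below computes an asymmetric viewing frustum for one fixed screen. It assumes the screen lies in the plane z = 0, the eye is at a positive z in front of it, and everything is in the same units. The six values are in the order that an OpenGL `glFrustum`-style call takes them; the function and type names are otherwise made up.

```python
from typing import NamedTuple


class Frustum(NamedTuple):
    left: float
    right: float
    bottom: float
    top: float
    near: float
    far: float


def off_axis_frustum(eye, screen_x, screen_y, near, far) -> Frustum:
    """Frustum extents at the near plane for a fixed screen in the plane z = 0.

    eye      -- (x, y, z) of the eye, with z > 0 in front of the screen
    screen_x -- (x_min, x_max) of the screen edges
    screen_y -- (y_min, y_max) of the screen edges
    """
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the screen (z > 0)")
    scale = near / ez  # similar triangles: screen plane -> near plane
    return Frustum((screen_x[0] - ex) * scale, (screen_x[1] - ex) * scale,
                   (screen_y[0] - ey) * scale, (screen_y[1] - ey) * scale,
                   near, far)


if __name__ == "__main__":
    print(off_axis_frustum((0.0, 0.0, 2.0), (-1.0, 1.0), (-1.0, 1.0), 0.1, 100.0))
    print(off_axis_frustum((0.5, 0.0, 2.0), (-1.0, 1.0), (-1.0, 1.0), 0.1, 100.0))
```

With the eye centred the frustum is symmetric, as in the usual camera model; as the eye moves sideways the frustum skews so that its edges still pass through the edges of the screen. For stereo this would be computed once per eye, with each eye offset from the head position, and the scene would also be translated so that the eye sits at the origin.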

How to Generate Graphics Quickly

Naive Approach

1. poll head sensor
2. poll hand sensor
3. update virtual world
4. draw world for left eye
5. draw world for right eye
6. display images
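The steps above can be sketched as a single function that runs once per frame. All of the callables passed in are hypothetical stand-ins for the tracker, simulation and renderer. The head is polled first here because the eye positions depend on it, and the eyes are offset along x only, which ignores head rotation.

```python
def run_naive_frame(poll_head, poll_hand, update_world, draw_eye, display,
                    interocular=0.0635):
    """One frame of the naive approach: every stage runs in sequence.

    Assumptions: poll_head() returns an (x, y, z) head position in metres,
    and 0.0635 m is the roughly 2.5 inch interocular distance.
    """
    head = poll_head()
    hand = poll_hand()
    world = update_world(head, hand)
    hx, hy, hz = head
    half = interocular / 2.0
    left_image = draw_eye(world, (hx - half, hy, hz))
    right_image = draw_eye(world, (hx + half, hy, hz))
    display(left_image, right_image)
```

Because nothing overlaps, the time per frame is the sum of all of the stages, and the images shown are based on tracker values that are already a full frame old by the time they are displayed.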

In VR systems like the CAVE this is more complicated because the scene must be drawn for the left and right eye on multiple projection planes.

Pipelined approach (from SGI Performer manual)

Often when moving through a virtual world, the world will seem to speed up or slow down based on the complexity of the scene. Performer allows the programmer to set a maximum frame rate - this is really helpful when new faster hardware appears.

It also allows the user to set 'stress values' that control how it responds when the frame rate drops too low, by simplifying the scene or applying other techniques such as the following:

Models can be replaced by models with less detail

3D models of far away objects can be replaced by texture mapped billboards

The horizon can be moved in - moving in Z-far and perhaps covering this with fog

A less complex lighting model can be used
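To illustrate the first two techniques, here is a sketch of distance-based level-of-detail selection with a stress factor. This is not Performer's API; the function names, switch distances and gain are all made up. The idea is that a stress value above 1 makes every object behave as if it were further away, so coarser models are chosen sooner.

```python
def select_lod(distance, switch_distances, stress=1.0):
    """Return which model to draw: 0 is the most detailed.

    switch_distances -- increasing distances at which to drop a level;
                        the last level could be a textured billboard.
    stress           -- 1.0 when the frame rate is fine, larger under load.
    """
    level = 0
    for switch in switch_distances:
        if distance * stress >= switch:
            level += 1
    return level


def update_stress(stress, frame_time, target_time, gain=0.1, max_stress=4.0):
    """Nudge the stress value up when frames take too long, down otherwise."""
    error = (frame_time - target_time) / target_time
    return min(max(1.0, stress + gain * error), max_stress)


if __name__ == "__main__":
    switches = [20.0, 50.0, 100.0]
    print(select_lod(30.0, switches))              # 1
    print(select_lod(30.0, switches, stress=2.0))  # 2
```

Changing the stress value gradually, rather than jumping, helps avoid models visibly popping back and forth between levels from one frame to the next.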

Simulator Sickness

Two things are needed: a functioning vestibular system (the canals in the inner ear) and a sense of motion.

Symptoms: Nausea, eyestrain, blurred vision, difficulty concentrating, headache, drowsiness, fatigue

These symptoms can persist after the VR experience is finished.

Causes: still unknown but one common hypothesis is a mismatch between visual motion (what your eyes tell you) and the vestibular system (what your ears tell you)

Why would this cause us to become sick? Possibly an inherited trait - a mismatch between the eyes and ears might be caused by ingesting a poisonous substance so vomiting would be helpful in that case.

- a sense of motion is required

- bright images are more likely to cause it than dark ones

- a wide field of view is more likely to cause it than a narrow field of view

- HMDs are more likely to cause it than projection systems

- low resolution, low frame rate and high latency are also likely causes

Another hypothesis deals with the lack of a rest frame. When a user views images on a screen with an obvious border, that border locates the user in the real world. Without that border the user loses his/her link to the real world and the effects of motion in the virtual world are more pronounced.

Fighter pilots have 20 to 40 percent sickness rates in flight simulators - but experienced pilots get sick more often than novice pilots.

In a rotating field when walking forward, people tilt their heads and feel like they are rotating in the opposite direction.

If a person is walking on a treadmill holding onto a stationary bar and you change the rate at which the visuals are passing by, it will feel to the person like the bar is pushing or pulling on their hands.

This all affects the kinds of worlds you create and how long a person can safely stay in that world.

It's easy (and fun) to induce vertigo. Most people really seem to enjoy jumping off of high places and falling in VR.

Open fields are less likely to cause problems than walking through tight tunnels; tunnels are very aggressive in terms of peripheral motion. This doesn't mean that you shouldn't have any tunnels, but you should be careful how much time the users spend there.

Pokemon Incident

December 16 1997

685 schoolchildren taken to hospitals - feeling sick while watching Pokemon

12 Hz red-blue flicker scene lasting about 5 seconds, roughly 20 minutes into the program

The show aired in several major cities (Tokyo, Osaka, etc.) and then excerpts were shown on the nightly news after reports came in - causing more cases. Broadcast of the show was cancelled in 30 other cities.

Pokemon incident was the first occurrence on a mass scale

New type of trigger, not just rapid light/dark - this is now known as "chromatic sensitive epilepsy."

Pokemon on the brain

Audio in virtual environments can be used for several purposes:

- audio can help the virtual environment give the user feedback that they have 'touched' something, that they have activated a menu item, that a new user has entered the space

- ambient audio can help set the mood of the virtual environment, and make the environment seem more real

- audio can also be used to send speech between multiple participants, but we will talk more about that in a future lecture

Simple audio can be monaural with a single speaker, but there are many advantages to having multiple speakers, as in a surround sound system, to give directional audio where sounds come from a particular location. As you get close to these sounds they get louder and are more localized. This can tell you where things have happened, or lead you towards something.
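A common way to make sounds get louder as you approach them is an inverse-distance gain. The sketch below is one such model, chosen here as an assumption rather than something these notes prescribe: the gain is 1.0 within a reference distance of the source and halves each time the distance doubles beyond it. The function name is made up.

```python
import math


def distance_gain(listener, source, reference_distance=1.0):
    """Volume multiplier (0..1] for a sound source at a 3D position.

    Assumption: clamped inverse-distance falloff; inside
    `reference_distance` the sound plays at full volume.
    """
    distance = math.dist(listener, source)
    return reference_distance / max(distance, reference_distance)


if __name__ == "__main__":
    listener = (0.0, 0.0, 0.0)
    for z in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"source {z} away: gain {distance_gain(listener, (0.0, 0.0, z)):.3f}")
```

The clamp matters: without it the gain would grow without limit as the listener walks through the source.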

Subwoofers can be good for rumbling. Connecting them to a vibrating floor can add additional feedback.

The sounds themselves can be prerecorded clips that are played back or looped, or sounds can be synthesized. Synthesized sounds can be useful in scientific environments to give feedback on the current state of the world

It's important to have high-quality prerecorded clips - background hiss can be very noticeable.

You will usually want to play back multiple sounds at the same time - some of these will be looping environmental sounds and others will be sounds played when certain events occur. There are several free audio libraries out there that can handle these things in a pretty straightforward way.

It's important to balance all of these sounds so that some sounds do not unintentionally hide others. You also need to be careful that you do not overload the speakers.

You also need to be careful that you are not playing too many sounds at the same time, or playing the same sound too many times. For example when there are multiple people running, or multiple water droplets hitting a pool, it is probably a bad idea to play a sound for each of those events or you will just hear noise. One method is to only play an event sound if it hasn't been played in the previous n seconds.
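That last method can be sketched in a few lines. The class below only decides whether an event sound should be played; actually playing it is left to whatever audio library is in use. The class and method names are hypothetical, and the caller passes in the current time in seconds.

```python
class EventSoundLimiter:
    """Allow each named sound to start at most once every `min_interval` seconds."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last_played = {}  # sound name -> time it last started

    def should_play(self, sound_name: str, now: float) -> bool:
        """Return True (and record the time) if the sound may be played now."""
        last = self._last_played.get(sound_name)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_played[sound_name] = now
        return True


if __name__ == "__main__":
    limiter = EventSoundLimiter(min_interval=0.5)
    for t in (0.0, 0.1, 0.2, 0.6, 0.7):
        print(t, limiter.should_play("drip", t))
```

Each sound name is limited separately, so a burst of water droplets will not stop a footstep sound from playing.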

It's usually a good idea to load all of your sounds in at the beginning of the program and store them in memory. Any repeating sound can be set to repeat and play at zero volume, and then faded up when needed and faded down when not needed.
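The fading can be done by moving each looping sound's volume a little closer to its target every frame. Below is a minimal sketch; the function name and the one second default fade time are assumptions, and setting the resulting volume on the sound is left to the audio library.

```python
def fade_step(current: float, target: float, dt: float,
              fade_time: float = 1.0) -> float:
    """Move `current` volume towards `target` at a fixed rate.

    A full fade from 0 to 1 (or 1 to 0) takes `fade_time` seconds;
    `dt` is the time since the last call, in seconds.
    """
    max_change = dt / fade_time
    if current < target:
        return min(target, current + max_change)
    return max(target, current - max_change)


if __name__ == "__main__":
    volume = 0.0
    for frame in range(5):
        volume = fade_step(volume, 1.0, dt=0.25)
        print(f"frame {frame}: volume {volume:.2f}")
```

Setting the target to 1.0 when the sound is needed and 0.0 when it is not gives a smooth fade in both directions without ever stopping or restarting the loop.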

As with visuals, it's important to audition your audio in the environment to make sure it works.

### For Thursday

• Cruz-Neira, C., Sandin, D., DeFanti, T., Kenyon, R., Hart, J., The CAVE: Audio Visual Experience Automatic Virtual Environment, Communications of the ACM, 06/01/1992

Tracking