The Rise and Fall and Rise and Fall of Virtual Reality

How Will Apple’s Vision Pro Pan Out?

Naturally I was curious when Apple launched its long-awaited virtual reality system, which it named Vision Pro. Rumors had been around for years. This was it.

This new product had to live up to Apple’s very high quality standards. It had to be a breakthrough in interface, imaging, and miniaturization. It had to overcome a myriad of engineering challenges in data integration, computational graphics, workflow, and application design. It had to create a new paradigm that the rest of the industry would copy, as it always does.

It wasn’t until recently, however, after the hype and the crowds died down, that I signed up for an in-depth demo. Does Vision Pro do all those things? The answer is yes, it does do all those things. But the question remains, is it the next generation of computing and entertainment? I am skeptical.

Could the Vision Pro be the Segway of computer platforms — that over-hyped miracle two-wheeler that was going to be the most important transportation innovation since feet? Segways turned out to be a niche product, valuable in warehouses, shopping malls, tourism, crowd management, and some police patrolling. A brilliant engineering success, but with so many logistical limitations that its mass use was a marketing mirage.

The heart of Vision Pro is stereoscopic space — virtual reality, or sometimes augmented reality. Both are three-dimensional (3D) views of the world, so Vision Pro assumes we want that. But the history and science of artificial 3D is mixed, at best.

A Little History

As I’ve noted elsewhere, stereopsis, or stereoscopic vision, is the only cue to spatial depth that cannot be rendered in two dimensions. It’s all in your head. The only way to create it artificially is to show each eye a different image, taken a few inches apart. Your brain does the rest.

The earliest implementation was the stereoscope, familiar to most of us from antique shops and science museums.

The popular 19th Century Holmes stereoscope

It simply places magnifying lenses in front of each eye, focused on twin photos a few inches away. The lenses direct each eye to the center of each picture.

You can duplicate a stereoscope without any equipment, just using your smartphone and a little practice. Take two pictures of an interesting 3D scene (say 5 to 50 feet away) using the “cha-cha” method. Take a vertical picture with your weight on your left foot, then switch your weight to the right foot and take another. That moves the camera about the right distance. Keep the phone aimed straight forward. Then look at the pictures in gallery view, horizontally:

The lower right images are a left-eye, right-eye pair, as seen in my iPhone.

With the images paired up left-right as shown, you then look “through” them, gazing at a point some feet behind the screen. Your eyes should diverge as they uncross to look further away. Like the lenses in a stereoscope, you can make each eye look at a different image. The trick is to simultaneously focus on the little pictures while your eyes are trying to look into the distance. This is what geologists learn in school to view stereo pairs of land formations. If you aren’t successful looking into the distance, take the pictures in the opposite order (right then left) and cross your eyes to view.

Result: eyestrain.

But you see 3D!

All 3D movies use the same basic idea. They project pairs of images on the screen and use some sort of technology, invariably involving special glasses, to make each eye see only one of them.

The oldest method was anaglyph imagery, in which the two images are filtered through opposite-colored glasses, usually red and cyan. The earliest known demonstration was by Wilhelm Rollmann in Leipzig in 1853. Other techniques include polarizing filters, in which each lens passes the opposite polarization, and Active Shutter systems, in which battery-powered glasses use LCD shutters to black out each lens in sequence, so each eye sees every other frame.
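The anaglyph trick is simple enough to sketch in code: the red lens passes only the left image’s red channel, while the cyan lens passes only the right image’s green and blue channels, so you can composite both into one frame. A minimal, stdlib-only Python sketch (the function name and the tiny test images are my own, purely illustrative):

```python
# Build a red-cyan anaglyph from a left/right stereo pair.
# Each image is a list of rows; each row is a list of (r, g, b) tuples.
# The red filter shows the left eye only the left image's red channel;
# the cyan filter shows the right eye only the right image's green and blue.

def make_anaglyph(left, right):
    rows = []
    for lrow, rrow in zip(left, right):
        row = []
        for (lr, _, _), (_, rg, rb) in zip(lrow, rrow):
            row.append((lr, rg, rb))  # red from left eye, green/blue from right
        rows.append(row)
    return rows

# Tiny 1x2 example: a dark-red left image and a pure-white right image
left = [[(200, 0, 0), (200, 0, 0)]]
right = [[(255, 255, 255), (255, 255, 255)]]
print(make_anaglyph(left, right))  # [[(200, 255, 255), (200, 255, 255)]]
```

In practice you would do the same channel swap on real photo data with an imaging library, but the principle is just this per-pixel channel merge.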

Early 3D used two projectors in sync, one for the left eye and one for the right. Since movie prints frequently broke during projection, they had to be repaired by the projectionist, usually losing a frame or two. But 3D meant that both of the reels had to remain matched. If one print broke, the projectionist had to make a matching break in the other. If the projection were even one frame off sync, the movie became unwatchable and the audience got severe eyestrain. I believe that Active Shutter technology produces some of the same strain, but I’m not sure. In any case, the eyes aren’t seeing matched pairs at the same time.

Even when projected successfully, 3D movies have other limitations, perceptual, financial, and artistic.

  • Depth of focus: Part of our depth perception comes from our eyes darting around a scene, focusing on near and far objects in succession. Everything looks in focus. That’s why we have little real-life experience of using focus to represent depth — shallow focus and bokeh are artifacts of photography. 3D movies must support our instinctive eye movement, so the whole frame has to be sharp. 3D therefore needs small apertures and shorter lenses for each shot, curtailing the director’s ability to use focus to direct our attention.
  • Conflicting real and perceived depth: in 3D, the audience is looking at the screen at a constant distance, but the apparent distance to objects changes, especially when pies are thrown, or lions or zombies jump out of the shadows. This causes eyestrain, since the brain thinks it should change focus when it shouldn’t. This is similar to the eyestrain viewing stereo pairs on your phone — the convergence of your eyeballs and their focus get out of sync.
  • Dimness: The polarized-filter process used most widely today dims the picture substantially, by more than a photographic stop. The same goes for Active Shutter.
  • Rigidity of camera position: In real life, our heads and bodies move around, changing our viewpoint. Watching a 3D movie is like having your head strapped to a board — your eyes can move around the picture, but your point of view is chained to the camera.
  • Discomfort: The glasses you must wear are an uncomfortable bother, especially if you must also wear your own glasses.
  • General eyestrain, even nausea: See all of the above.
  • Cost: 3D is more expensive to produce, which studios don’t like. Projectors are more expensive, which the theatres don’t like. As a consequence, tickets are more expensive, which audiences don’t like. Win-win-win?

I think the most important drawbacks of 3D are artistic, however. 3D doesn’t add much of anything artistically, as Roger Ebert emphasized in Why I hate 3D movies. “A great film completely engages our imaginations. What would Fargo gain in 3D? Precious? Casablanca?” Ebert felt that 3D could actually be a distraction, particularly mentioning that directors can’t use focus to guide the viewer’s attention.

3D doesn’t even add much visually. As Christopher Nolan put it in a 2010 interview, “I think it’s a misnomer to call it 3D versus 2D. The whole point of cinematic imagery is it’s three dimensional… You know 95% of our depth cues come from occlusion, resolution, color and so forth, so the idea of calling a 2D movie a ‘2D movie’ is a little misleading.”

Alfred Hitchcock filmed Dial M for Murder in 3D and was so displeased with the result that he released it first in 2D. He never did another.

The proof is in the numbers. Here is a graph of the number of 3D movies released over the last 100 years.

Source: StephenFollows.com

The history of 3D cinema is like Samuel Johnson’s definition of second marriages: the triumph of hope over experience. When the movie industry feels threatened by a new technology, like television or streaming, it comes up with a counter technology designed to drag customers back to the theater. Generations of Hollywood suits seem to have no institutional memory.

It. Doesn’t. Work.

Of course there are exceptions. Take James Cameron’s Avatar, the highest-grossing movie of all time. But this was a novelty film, and so are the sequels. The genres most successful in 3D are documentaries, slapstick comedy, horror, CGI sci fi and action films, and, of course, sex. Any genre that puts physical immediacy above narrative.

3D television had a boomlet in the early 2010s, and promptly went away. I think I even owned a set. The feature came with a Google TV I bought, but I never even set it up. The salesman hyped the Netflix and Hulu apps. He never mentioned 3D.

So What About Vision Pro?

I showed up on time for my demo at the beautiful Carnegie Library Apple Store in Washington DC. They checked me in. I waited a few minutes to be summoned to a circle of upholstered benches for the demo.

My guide Alexander sat me down and had me stare into his iPhone at a square target. He told me to move my head around to measure something, then he ordered the appropriate headset inserts with a few taps on his phone. A couple of minutes later my headset arrived, fully configured. While we waited, Alexander explained the basic moves: just look at something to select it (amazing!), tap your thumb and index finger together to click. Pinch thumb and finger, sweep, and release to scroll. And that’s about it. A little practice and it comes quite naturally. There is a crown, similar to the Apple Watch crown, above your right eye. Press to go Home. Turn it to… I guess I forgot what. Maybe sound volume.

The headset weighs 22 ounces. Not terrible, but by the end I was tired of wearing it. Turn it on and you are looking at the surrounding room through cameras built into the set. It’s remarkably realistic — a little dimmer, but the visual match with the outside world seems perfect, like looking through a pane of slightly frosted glass.

But as you get into it, many dimensions emerge. One use of Vision Pro is as a conventional computer. Applications appear to hang before you in space, set against any background you want.

Meh. I’m old school. If I’m plugging away at Excel, I want my 27-inch monitor and a mouse. I don’t need Excel to hover in front of the mountains of the moon with me waving my hands at it.

Virtual reality, not computing, is the essence of Vision Pro. It is what I came to see. It comes in two flavors: Spatial Video and Immersive Video.

Spatial Video puts third-party 2D or 3D content up on the screen in its original format — a picture set within the wider 3D world of the Vision Pro. You can watch a 2D movie as if you were sitting in a theater. (It even lets you select your place in the audience.) Or you can watch a homemade 3D movie — such as one taken with your iPhone held sideways:

With dedicated apps, your iPhone can take 3D movies by using two cameras at once.

The problem with iPhone 3D is that the lenses are only 3/4″ apart, a weak substitute for real interocular distance (2.5 inches). Since the cameras don’t match in resolution or focal length, a little computational whiz-bang is needed behind the scenes, and it detracts from quality. You can also take Spatial Videos with the Vision Pro, where the interocular distance is normal. But who would? I can’t imagine walking around in public looking like a robot from a low-budget sci fi movie.
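The effect of that narrow baseline is easy to quantify. The angular disparity between the two views of a point at distance d, with lens separation b, is roughly 2·atan(b / 2d); a smaller baseline means a proportionally weaker depth signal. A quick back-of-the-envelope sketch (the function name and distances are my own, purely illustrative):

```python
import math

def disparity_deg(baseline_in, distance_in):
    """Angular disparity (degrees) between two viewpoints of a point
    at distance_in, with the viewpoints separated by baseline_in."""
    return math.degrees(2 * math.atan(baseline_in / (2 * distance_in)))

# An object 10 feet (120 inches) away:
for name, b in [("iPhone lenses (0.75 in)", 0.75), ("human eyes (2.5 in)", 2.5)]:
    print(f"{name}: {disparity_deg(b, 120):.2f} deg")
```

At 10 feet the iPhone’s lenses see about a third of the disparity your eyes do, which is why the depth effect reads as shallow without computational help.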

Watching spatial video is just fine, except (1) I almost never watch movies by myself and (2) Spatial Video is constrained by all the 3D shortcomings I discussed before.

Apple’s Immersive Video is another thing, however. Fantastic.

Immersive Video covers 160 degrees horizontally and I don’t know how many degrees vertically. It’s created using matching 8K cameras, one for each eye. The resolution, contrast, and brilliance are breathtaking. It covers your entire field of view. The sound is stunning. You are there.

Immersive video seems to solve the problems of traditional 3D. The demos I saw were dynamic, with the camera moving as you would move yourself. The field of view is huge, so you don’t seem to be staring down a tube. The eye relief of the entire Vision Pro is great. You are quite comfortable.

In addition to video, Immersive Video does 3D graphics. Apple has a number of demos where the same field of view is given over to free-floating objects — a space station rotating against the stars, a jet engine you can pull apart. You can manipulate the objects and see them from any angle. The potential for training, especially advanced training in fields like surgery, is exciting.

Immersive Video will be wonderful for documentaries, training, and short subjects such as on YouTube. Any application whose purpose is exposition, not narrative.
