
Mira Prism and Dreamworld AR – (What Disney Should Have Done?)

That Was Fast – Two “Bug-Eye” Headsets

A few days ago I published a story on the Disney Lenovo optics and wondered why they didn't use much simpler "bug-eye" combiner optics similar to the Meta-2 (below right), which currently sells in a development kit version for $949. It turns out that the very same day, Mira announced their Prism headset, a totally passive headset with a mount for a phone and bug-eye combiners, with a "presale price" of $99 (proposed retail $150). Furthermore, in looking into what Mira was doing, I discovered that back on May 9th, 2017, DreamWorld announced their "DreamGlass" headset using bug-eye combiners that also includes tracking electronics and is supposed to cost "under $350" (see the Appendix for a note on a lawsuit between DreamWorld and Meta).

The way both of these work (Mira's is shown on the left) is that the cell phone produces two small images, one for each eye, that reflect off two curved semi-mirror combiners that are joined together. The combiners reflect part of the phone's light and move the focus of the image out in space (because otherwise a human could not focus so close).
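
For a rough feel of how the curved semi-mirror pushes the focus out, the simple spherical mirror equation is enough. The sketch below uses purely illustrative numbers; neither Mira's nor DreamWorld's actual combiner radius or phone spacing is public as far as I know.

```python
# Rough sketch: virtual image distance from a concave semi-mirror combiner.
# All numbers are illustrative assumptions, not measured from either product.

def virtual_image_mm(radius_mm, object_mm):
    """Spherical mirror: f = R/2; 1/f = 1/do + 1/di (real-is-positive convention).
    With the display inside the focal length, di comes out negative,
    i.e. a virtual image behind the mirror that the eye can comfortably focus on."""
    f = radius_mm / 2.0
    return 1.0 / (1.0 / f - 1.0 / object_mm)

R = 200.0   # assumed combiner radius of curvature (mm)
do = 60.0   # assumed phone-to-combiner path length (mm)
di = virtual_image_mm(R, do)
print(f"Virtual image ~{abs(di):.0f} mm behind the combiner")  # ~150 mm for these numbers
```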

Real or Not?: Yes Mira, Not Yet Dreamworld

Mira has definitely built production quality headsets, as there are multiple reports of people trying them on and independent pictures of the headset, which looks to be near-final if not a finished product.

DreamWorld, at least as of their May 9th announcement, did not have a fully functional prototype per Upload's article. What may appear to be "pictures" of the headset are 3-D renderings. Quoting Upload:

“Dreamworld’s inaugural AR headset is being called the Dreamworld Glass. UploadVR recently had the chance to try it out at the company’s offices but we were not allowed to take photos, nor did representatives provide us with photographs of the unit for this story.

The Glass we demoed came in two form factors. The first was a smaller, lighter model that was used primarily to show off the headset’s large field of view and basic head tracking. The second was significantly larger and was outfitted with “over the counter” depth sensors and cameras to achieve basic positional tracking. “

The bottom line here is that Mira's headset appears nearly ready to ship whereas DreamWorld still has a lot of work left to do and at this point is more of a concept than a product.

DreamWorld's "Shot Directly From DreamWorld's AR Glass" videos were shot through a combiner, but it may or may not be their production combiner configured with the phone in the same place as the production design.

I believe the views shown in the Mira videos are real, but they are, of course, shot separately: the people in the videos wearing the headset and what the image looks like through the headset. I will get into one significant problem I found with Mira's videos/design later (see the "Mira Prism's Mechanical Interference" section below).

DreamWorld Versus Mira Optical Comparison

While both DreamWorld and Mira have similar optical designs, on closer inspection it is clear that there is a very different angle between the cell phone display and the combiners (see left). DreamWorld has the cell phone display nearly perpendicular to the combiner whereas Mira has it nearly parallel. This difference in angle means that there will be more inherent optical distortion in the DreamWorld design, whereas the Mira design has the phone more in the way of the person's vision, particularly if they wear glasses (once again, see the "Mira Prism's Mechanical Interference" section below).

See-Through Trade-offs of AR

Almost all see-through designs waste most of the display's light in combining the image with the real world light. Most designs lose 80% to 95% (sometimes more) of the display's light. This in turn means you want to start with a display 20 to as much as 100 times (for outdoor use) the brightness of a cell phone. So even an "efficient" optical design has serious brightness problems when starting with a cell phone display (sorry, this is just a fact). There are some tricks to avoid these losses, but not if you are starting with the light from a cell phone's display (broad spectrum and very diffuse).

One thing I was very critical of last time with the Disney-Lenovo headset was that it appeared to be blocking about 75 to 80% of the ambient/real-world light, which is equivalent to dark sunglasses. I don't think any reasonable person would find blocking this much light to be acceptable for something claiming to be a "see-through" display.

From several pictures I have of Mira's prototype, I very roughly calculated that they are about 70% transparent (light to medium dark sunglasses), which means they in turn are throwing away 70+% of the cell phone's light. One of the images from Mira's videos is shown below. I have outlined with a dashed line the approximate active FOV (the picture cuts it off on the bottom), which Mira claims covers about 60 degrees, and you can see the edge of the combiner lens (indicated by the arrows).

What is important to notice is that the images are somewhat faded and do not "dominate"/block out the real world. This appears true of all the through-the-optics images in Mira's videos. The room, while not dark, is also not overly brightly lit. This is going to be a problem for any AR device using a cell phone as its display. With AR optics you are going to throw away a lot of the display's light to support seeing through to the real world, and you have to compete with the light that is in the real world. You could turn the room lights out and/or look at black walls and tables, but then what is the point of being "see-through"?

I also captured a through-the-optics image from DreamWorld's DreamGlass video (below). The first thing that jumps out at me is how dark the room looks and that they have a very dark table. So while the images may look more "solid" than in the Mira video, most of this is due to the lighting of the room.

Because the DreamWorld background is darker, we can also see some of the optical issues with the design. In particular you should notice the “glow” around the various large objects (indicated by red arrows). There is also a bit of a double image of the word “home” (indicated by the green arrow). I don’t have an equivalent dark scene from Mira so I can’t tell if they have similar issues.

Mira Prism’s Resolution

Mira supports only the iPhone 6/6s/7 size display and not the larger "Plus" iPhones, which won't fit. This gives them 1334 by 750 pixels to start with. The horizontal resolution first has to be split in half, and then about 20% of the center is used to separate the two images and center the left and right views with respect to the person's eyes (this roughly 20% gap can be seen in Mira's video). This nets about (1334/2) X 80% = ~534 pixels horizontally. Vertically they may have slightly higher resolution of about 600 pixels.

Mira claims a FOV of "60 degrees," and generally when a company does not specify whether it is horizontal, vertical, or diagonal, they mean diagonal because it is the bigger number. This would suggest that the horizontal FOV is about 40 degrees and the vertical is about 45 degrees. This nets out to a rather chunky 4.5 arcminutes/pixel (about the same as the Oculus Rift CV1 but with a narrower FOV). The "screen door effect" of seeing the boundaries between pixels is evident in Mira's videos and should be noticeable when wearing the headset.
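
As a quick sanity check, the resolution and FOV numbers above can be reproduced with a few lines of arithmetic (the 40/45 degree split of the claimed 60 degree diagonal and the ~600 vertical pixels are the estimates from above):

```python
import math

# Back-of-the-envelope check of the Mira Prism resolution estimates above.
phone_h = 1334                         # iPhone 6/6s/7 panel is 1334 x 750 (landscape)
per_eye_h = (phone_h / 2) * 0.8        # half the panel per eye, ~20% lost to the center gap
per_eye_v = 600                        # rough vertical estimate from above

fov_h, fov_v = 40.0, 45.0              # assumed split of the claimed 60 degree diagonal
diagonal = math.hypot(fov_h, fov_v)    # ~60 degrees, consistent with Mira's claim

arcmin_per_pixel = (fov_h * 60) / per_eye_h
print(f"~{per_eye_h:.0f} x {per_eye_v} pixels per eye, ~{diagonal:.0f} degree diagonal, "
      f"~{arcmin_per_pixel:.1f} arcmin/pixel")   # ~534 x 600, ~60 deg, ~4.5 arcmin
```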

I'm not sure that supporting a bigger iPhone, as in the Plus size models, would help. This design requires that the left and right images be centered over the eyes, which limits where the pixels in the display can be located. Additionally, a larger phone would cause more mechanical interference issues (such as with glasses, covered in the next section).

Mira Prism’s Mechanical Interference

A big problem with a simple bug-eye combiner design is the location of the display device. For the best image quality you want the phone right in front of the eye and as parallel as possible to the combiners. You can’t see through the phone so they have to move it above the eye and tilt it from parallel. The more they move the phone up and tilt it, the more it will distort the image.

If you look at the upper right ("A") still frame from Mira's video below, you will see that the phone is just slightly above the eyes. The bottom of the phone holder is touching the top of the person's glasses (large arrow in frame A). The video suggests (see frames "B" and "C") that the person is looking down at something in their hand. But as indicated by the red sight line I have drawn in frames A and B, the person would have to be looking largely below the combiner and thus the image would at best be cut off (and not look like the image in frame C).

In fact, for the person with glasses in the video to see the whole image they would have to be looking up as indicated by the blue sight lines in frames A and B above. The still frame “D” shows how a person would look through the headset when not wearing glasses.

I can't say whether this would be a problem for all types of glasses and head shapes, but it is certainly a problem that is demonstrated in Mira's own video.

Mira's design may be a bit too simple. I don't see any adjustments other than the head band size. I don't see any way to work around, say, running into a person's glasses as happens above.

Cost To Build Mira’s Prism

Mira's design is very simple. The combiner technology is well known and can be sourced readily. Theoretically, Mira's Prism should cost about the same to make as a number of so-called "HUD" displays that use a cell phone as the display device and a (single) curved combiner, and that sell for between $20 and $50 (example on right). BTW, these "HUDs" are useless in daylight as a cell phone is just not bright enough. Mira needs a somewhat more complex combiner, and hopefully of better quality than some of the so-called "HUDs," so $99 is not totally out of line; they should be able to make them at a profit for $99.

Conclusions On Simple Bug-Eye Combiner Optics With A Phone

First let me say I have discussed Mira's Prism more than DreamWorld's DreamGlass above because there is frankly more solid information on the Prism. DreamGlass seems to be more of a concept without tangible information.

The Mira headset is about as simple and inexpensive as one could make an AR see-through headset, assuming you can use a person's smartphone. It does the minimum needed to enable a person to focus on a phone that is so close and to combine the image with the real world. Compared to, say, the Disney-Lenovo birdbath, it is going to make both the display and the real world more than 2X brighter. As Mira's videos demonstrate, the images are still going to be ghostly and not very solid unless the room and/or background is pretty dark.

Simplicity has its downsides. The resolution is low, and the image is going to be a bit distorted (which can be corrected somewhat by software at the expense of some resolution). The current design appears to have mechanical interference problems with wearing glasses. It's not clear if the design can be adapted to accommodate glasses, as doing so would seem to move the whole optical design around and might necessitate a bigger headset and combiners. Fundamentally, a phone is not bright enough to support a good see-through display in even moderately lit environments.

I don't mean to be overly critical of Mira's Prism as I think it is an interesting low cost entry product, sort of the "Google Cardboard" of AR (it certainly makes more sense than the Disney-Lenovo headset that was just announced). I would think a lot of people would want to play around with the Mira Prism and find uses for it at the $99 price point. I would expect to see others copying its basic design. Still, the Mira Prism demonstrates many of the issues with making a low cost see-through design.

DreamWorld's DreamGlass on the surface makes much less sense to me. It should have all the optical limitations of the much less expensive Mira Prism. It is adding a lot of cost on top of a very limited display foundation, a smartphone's display.

Appendix

Some History of Bug-Eye Optics

It should be noted that what I refer to as bug-eye combiner optics is an old concept. Per the picture on the left, taken from a 2005 Link/L3 paper, the concept goes back to at least 1988 using two CRTs as the displays. This paper includes a very interesting chart plotting the history of Link/L3 headsets (see below). Link's legacy goes all the way back to airplane training simulators (famously used in World War II).

A major point of L3/Link's later designs is that they used corrective optics between the display and the combiner to correct for the distortion caused by the off-axis relationship between the display and the combiner.

Meta and DreamWorld Lawsuit

The basic concept of dual large combiners in a headset is obviously an old idea (see above), but apparently Meta thinks that DreamWorld may have borrowed without asking a bit too much from the Meta-2. As reported in TechCrunch, "The lawsuit alleges that Zhong [Meta's former Senior Optical Engineer] "shamelessly leveraged" his time at the company to "misappropriate confidential and trade secret information relating to Meta's technologies"."

Addendum

Holokit AR

Aryzon AR

There are at least two other contenders for the title of "Google Cardboard of AR," namely the Aryzon and Holokit, which both separate the job of the combiner from the focusing. Both put a Fresnel lens between the phone and a flat semitransparent combiner. These designs are one step simpler/cheaper than Mira's design (and use cardboard for the structure), but are more bulky with the phone hanging out. An advantage of these designs is that everything is "on-axis," which means lower distortion, but they have chromatic aberration (color separation) issues with the inexpensive Fresnel lenses that Mira's mirror design won't have. There may also be some Fresnel lens artifact issues with these designs.

Disney-Lenovo AR Headset – (Part 1 Optics)

Disney Announced Joint AR Development At D23

Disney at their D23 Fan Convention in Anaheim on July 15th, 2017 announced an Augmented Reality (AR) Headset jointly developed with Lenovo. Below is a crop and brightness enhanced still frame capture from Disney’s “Teaser” video.

Disney/Lenovo also released a video of an interview at the D23 convention which gave further details. As the interview showed (see right), the device is based on using a person's cell phone as the display (similar to Google Cardboard and Samsung's Gear VR).

Birdbath Optics

Based on analyzing the two videos plus some knowledge of optical systems, it is possible to figure out what they are doing in terms of the optical system. Below is a diagram of what I see them as doing in terms of optics (you may want to open this in a separate window to view the figure during the discussion below).

All the visual evidence indicates that Disney/Lenovo are using a classical "birdbath" optical design (discussed in an article on March 03, 2017). The name "birdbath" comes from the use of a spherical semi-mirror with a beam splitter directing light into the mirror. Birdbath optics are used because they are relatively inexpensive, lightweight, support a wide field of view (FOV), and are "on axis" for minimal distortion and focusing issues.

The key element of the birdbath is the curved mirror, which is (usually) the only "power" (focus changing) element. The beauty of mirror optics is that they have essentially zero chromatic aberrations, whereas it is difficult/expensive to reduce chromatic aberrations with lens optics.

The big drawbacks of birdbath optics are that they block a lot of light, both from the display device and the real world, and that they produce double images from unwanted reflections of "waste" light. Both of these negative effects can be seen in the videos.

There would be no practical way (that I know of) to support a see-through display with a cell phone sized display using refractive (lens) optics such as those used with Google Cardboard or the Oculus Rift. The only practical ways I know of for supporting an AR/see-through display using a cell phone size display all use curved combiner/mirrors.

Major Components

Beam Splitter – The design uses a roughly 50/50 semi-mirror beam splitter which has a coating (typically an aluminum alloy, although it is often called "silver") that lets about 50 percent of the light through while acting like a mirror for the other 50%. Polarizing beam splitters would be problematic with most phones and are much more expensive. You should note that the beam splitter is arranged to kick the image from the phone toward the curved combiner and away from the person's eyes; thus light from the display is reflected first and then has a transmissive pass.

Combiner – The combiner, a spherical semi-mirror, is the key to the optics and does multiple things. The combiner appears to also be about 50-50 transmissive-mirror. The curved mirror's first job is to allow the user to focus on the phone's display, which otherwise would be too close to a person's eyes to support comfortable focusing. The other job of the combiner is to combine the light/image from the "real world" with the display light; it does this with the semi-mirror allowing light from the image to reflect and light from the real world to be directed toward the eye. The curved mirror only has a significant optical power (focus) effect on the reflected display light and causes very little distortion of the real world.

Clear Protective Shield

As best I can tell from the two videos, the shield is pretty much clear and serves no function other than to protect the rest of the optics.

Light Baffles Between Display Images

One thing seen in the picture at top is a set of stepped light baffles to keep down light cross-talk between the eyes.

Light Loss (Follow the Red Path)

A huge downside of the birdbath design is the light loss, as illustrated in the diagram by the red arrow path, where the thickness of the arrows is roughly to scale with the relative amount of light. To keep things simple, I have assumed no other losses (there are typically 2% to 4% per surface).

Starting with 100% of the light leaving the phone display, about 50% of it goes through the beam splitter and is lost while the other 50% is reflected to the combiner. The combiner is also about 50% mirrored (a rough assumption), and thus 25% (0.5 X 0.5) of the display's light has its focus changed and is reflected back toward the beam splitter. About 25% of the light also goes through the combiner and causes the image you can see in the picture on the left. The beam splitter in turn allows 50% of the 25%, or only about 12.5% of the light, to pass toward the eye. Allowing for some practical losses, less than 10% of the light from the phone makes it to the eye.
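
The red-path arithmetic boils down to a chain of three roughly 50% losses; a minimal sketch, using the same rough 50/50 assumptions as above:

```python
# Display-light path through the birdbath (rough 50/50 assumptions from above).
display = 1.00
after_beamsplitter_reflect = display * 0.5                 # ~50% lost through the splitter
after_combiner_reflect = after_beamsplitter_reflect * 0.5  # ~25% reflected back with focus changed
to_eye = after_combiner_reflect * 0.5                      # second pass through the splitter
print(f"~{to_eye:.1%} of the phone's light reaches the eye before other surface losses")  # ~12.5%
```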

Double Images and Contrast Loss (Follow the Green Dash Path)

Another major problem with the birdbath optics is that the lost light will bounce around and cause double images and losses in contrast. If you follow the green path, like the red path about 50% of the light will be reflected and 50% will pass through the beamsplitter (not shown on the green path). Unfortunately, a small percentage of the light that is supposed to pass through will be reflected by the glass/plastic-to-air interface as it tries to exit the beamsplitter, as indicated by the green and red dashed lines (part of the red dashed line is obscured). This dashed path will end up causing a faint/ghost image that is offset by the thickness of the beamsplitter tilted at 45 degrees. Depending on coatings, this ghost image could be from 1% to 5% of the brightness of the original image.
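
The few-percent ghost is consistent with the Fresnel reflection at an uncoated glass or plastic to air exit surface; a quick estimate, assuming a typical refractive index of about 1.5 and no anti-reflection coating:

```python
# Fresnel reflectance at normal incidence for an uncoated glass/plastic-to-air exit surface.
# n = 1.5 is a generic assumption; an anti-reflection coating would reduce this to ~1% or less.
n_glass, n_air = 1.5, 1.0
reflectance = ((n_glass - n_air) / (n_glass + n_air)) ** 2
print(f"~{reflectance:.1%} of the exiting light is reflected back into the optics")  # ~4%
```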

The image on the left is a crop from a still frame from the video Disney showed at the D23 conference with red arrows I added pointing to double/ghost images (click here for the uncropped image). The demo Disney gave was on a light background, and these double images would be even more noticeable on a dark background. The same type of vertically offset double image could be seen in the Osterhout Design Group (ODG) R8 and R9 headsets that also use a birdbath optical path (see figure on the right).

A general problem with the birdbath design is that there is so much light "rattling around" in the optical wedge formed by the display surface (in this case the phone), the beamsplitter, and the combiner mirror. Note in the diagram that about 12.5% of the light returning from the combiner mirror is reflected off the beam splitter and heads back toward the phone. This light is eventually going to hit the front glass of the phone, and while much of it will be absorbed by the phone, some of it is going to reflect back, hit the beam splitter, and eventually make it to the eye.

About 80% of the Real World Light Is Blocked

In several frames in the D23 interview video it was possible to see through the optics and make measurements of the relative brightness looking through versus around the optics. This measurement is only rough, and it helped to take it in several different images. The result was about a 4.5 to 5X difference in brightness looking through the optics.

Looking back at the blue/center line in the optical diagram, about 50% of the light is blocked by the partial mirror combiner and then 50% of that light is blocked by the beam splitter, for a net of 25%. With other practical losses including the shield, this comes close to roughly 80% (4/5ths) of the light being blocked.

Is A Cell Phone Bright Enough?

For movies in a dark room, the ANSI/SMPTE 196M spec recommends about 55 nits. A cell phone typically has from 500 to 800 peak nits (see Displaymate's Shootouts for objective measurements), but after about a 90% optical loss the image would be down to between about 50 and 80 nits, which is possibly just enough if the background/room is dark. But if the room lights are on, this will be at best marginal, even after allowing for the headset blocking about 75 to 80% of the room light between the combiner and the beam splitter.
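
Putting rough numbers on the brightness argument, using the phone and loss figures quoted above:

```python
# Rough display brightness after the birdbath losses, using the figures quoted above.
phone_peak_nits = (500, 800)      # typical peak cell phone brightness range
optical_throughput = 0.10         # ~90% loss estimated for the birdbath path
smpte_dark_room_nits = 55         # ANSI/SMPTE 196M recommendation for a dark room

for nits in phone_peak_nits:
    print(f"{nits} nit phone -> ~{nits * optical_throughput:.0f} nits at the eye "
          f"(vs ~{smpte_dark_room_nits} nits recommended for movies in a dark room)")
```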

With AR you are not just looking at a blank wall. To make something look "solid"/non-transparent, the display image needs to "dominate" by being at least 2X brighter than anything behind it. It becomes even more questionable that there is enough brightness unless there is not a lot of ambient light (or everything in the background is dark colored or the room lights are very dim).

Note, LCOS or DLP based see-through AR systems can start with about 10 to 30 times or more the brightness (nits) of a cell phone. They do this so they can work in a variety of light conditions after all the other light losses in the system.

Alternative Optical Solution – Meta-2 “Type”

Using a large display like a cell phone rather than a microdisplay severely limits the optical choices for a see-through display. Refractive (lens) optics, for example, would be huge and expensive, leaving Fresnel optics with their own optical issues.

Meta-2 “Bug-Eye” Combiners

The most obvious alternative to the birdbath would be to go with dual large combiners such as the Meta-2 approach (see left). When I first saw the Disney-Lenovo design, I even thought it might be using the Meta-2 approach (disproven on closer inspection). With the Meta-2, the beam splitter is eliminated and two much larger curved semi-mirror combiners (giving a "bug-eye" look) have a direct path to the display. Still, the bug-eye combiner is not that much larger than the shield on the Disney-Lenovo system. Immediately, you should notice how the user's eyes are visible, which shows how much more light is getting through.

Because there is no beamsplitter, the Meta-2 design is much more optically efficient. Rough measurements from pictures suggest the Meta-2's combiners pass about 60% of the light and thus reflect about 40%. This means that with the same display, it would make the display appear 3 to 4 times brighter while letting through about 2.5X as much of the real world light as the Disney-Lenovo birdbath design.
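
A minimal sketch of the comparison, using this article's rough estimates (roughly 50/50 birdbath surfaces and a roughly 60/40 Meta-2 combiner):

```python
# Rough display and real-world light efficiency comparison. All numbers are the
# article's estimates: ~50/50 birdbath surfaces, ~60/40 Meta-2 combiner.
birdbath_display   = 0.5 * 0.5 * 0.5   # splitter, combiner, splitter again -> ~12.5%
birdbath_realworld = 0.5 * 0.5         # combiner then splitter             -> ~25%
meta2_display      = 0.40              # single reflection off the combiner
meta2_realworld    = 0.60              # single transmissive pass

print(f"Display:    Meta-2 ~{meta2_display / birdbath_display:.1f}x brighter")      # ~3.2x
print(f"Real world: Meta-2 ~{meta2_realworld / birdbath_realworld:.1f}x brighter")  # ~2.4x
```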

I have not tested a Meta-2, nor have I read any serious technical evaluation (just the usual "ooh-wow" articles), and I have some concerns with the Meta design. The Meta-2 is "off-axis" in that the display is not perfectly perpendicular to the combiner. One of the virtues of the birdbath is that it results in a straightforward on-axis design. With the off-axis design, I wonder how well the focus distance is controlled across the FOV.

Also, the Meta-2 combiners are so far from the eyes that a person's two eyes would have optical cross-talk (there is nothing to keep one eye from seeing what the other eye is seeing, such as the baffles in the Disney-Lenovo design). I don't know how this would affect things in stereo use, but I would be concerned.

In terms of simple image quality, I would think it would favor the single bug-eye style combiner. There are no secondary reflections caused by the beamsplitter, and both the display and the real world would be significantly brighter. In terms of cost, I see pros and cons for each design and overall not a huge difference, assuming both designs start with a cell phone display. In terms of weight, I don't see much of a difference either.

Conclusions

To begin with, I would not expect even good image quality out of a phone-as-a-display AR headset. Even totally purpose-built AR displays have their problems. Making a device "see-through" generally makes everything more difficult/expensive.

The optical design has to be compromised right from the start to support both LCD and OLED phones that could have different sizes. Making matters worse is the birdbath design with its huge light losses. Add to this the inherent reflections in the birdbath design and I don’t have high hopes for the image quality.

It seems to me a very heavy "lift" even for the Disney and Star Wars brands. We don't have any details as to the image tracking and room tracking, but I would expect that, like the optics, it will be done on the cheap. I have no inside knowledge, but it almost looks to me like the solution was designed around supporting the Jedi light saber shown in the teaser video (right). They need the see-through aspect so the user can see the light saber. But making the headset see-through is a long way to go to support the saber.

BTW, I'm a big Disney fan from way back (I have been to the Disney parks around the world multiple times, attended D23 conventions, eaten at Club 33, was a member of the "Advisory Council" in 1999-2000, own over 100 books on Disney, and own one of the largest 1960's era Disneyland Schuco monorail collections in the world). I have an understanding and appreciation of Disney fandom, so this is not a knock on Disney in general.

Varjo Foveated Display Part 2 – Region Sizes

Introduction

As discussed in Part 1, the basic concept of a foveated display in theory should work to provide high angular resolution with a wide FOV. There is no single near-to-eye display technology today that does both. Microdisplays (LCOS, DLP, and OLED) support high angular resolution but not wide FOV, and larger flat panel displays (OLED and LCD) support wide FOV but with low angular resolution.

The image above left includes crops from the picture on Varjo's web site called "VR Scene Detail" (toward the end of this article is the whole annotated image). Varjo included both the foveated and un-foveated image from the center of the display. The top rectangle in red is taken from the top edge of the picture where we can just see the transition starting from the foveated image to what Varjo calls the "context" or lower resolution image. Blending is used to avoid an abrupt transition that the eye might notice.

The topic of foveation gathered additional interest with Apple's acquisition of the eye tracking technology company SMI, which provided the eye tracking technology for Nvidia's foveated rendering HMD study (see below). It is not clear at this time why Apple bought SMI; it could be for foveated rendering (f-rendering) and/or foveated display (f-display).

Static Visual Acuity

The common human visual acuity charts (right) give some feel for why foveation (f-rendering and/or f-display) works. But these graphs are for static images of high contrast black and white line pairs. We commonly talk about a person normally seeing down to 1 arcminute per pixel (300 dpi at about 10 inches) being good, but people can detect down to about 1/2 arcminute, and with a long single high contrast line, down to about 1/4th of an arcminute. The point here is to understand that these graphs are a one-dimensional slice of a multi-dimensional issue.

For reference, Varjo's high resolution display has slightly less than 1 arcminute/pixel and the context display in their prototype has about 4.7 arcminutes/pixel. More importantly, their high resolution display covers about 20 degrees horizontally and 15 degrees vertically, and this is within the range where people could see errors, if they are high in contrast, based on the visual acuity graphs.
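
For convenience, the same numbers expressed as pixels per degree (just a unit conversion of the figures above):

```python
# Convert between the two ways resolution is quoted in this article.
def pixels_per_degree(arcmin_per_pixel):
    return 60.0 / arcmin_per_pixel

print(f"~1 arcmin/pixel  -> ~{pixels_per_degree(1.0):.0f} pixels/degree (Varjo's foveated display)")
print(f"4.7 arcmin/pixel -> ~{pixels_per_degree(4.7):.0f} pixels/degree (context display)")
```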

Varjo will be blending to reduce the contrast difference and thus make the transition less noticeable. But on the negative side, with any movement of the eyes, the image on the foveated display will change and the visual system tends to amplify any movement/change.

Foveated Rendering Studies

F-rendering varies the detail/resolution/quality/processing based on where the eyes are looking. This is seen as key not only in reducing the computing requirement but also in saving power consumption. F-rendering has been proven to work with many human studies, including those done as part of Microsoft's 2012 and Nvidia's 2016 papers. F-rendering becomes ever more important as resolution increases.

F-rendering uses a single high resolution display and changes the level of rendering detail. It then uses blending between the various detail levels to avoid abrupt changes that the eye can detect. As the Microsoft and Nvidia papers point out, the eye is particularly sensitive to changes/movement.

In the case of the often-cited Microsoft 2012 study, they used 3 levels of detail with two "blend masks" between them as illustrated in their paper (see right). This gave them a very gradual and wide transition, but 3 resolution levels with wide bands of transition are "luxuries" that Varjo can't have. Varjo only has two possible levels of detail, and as will be shown, they can only afford a narrow transition/blend region. The Microsoft 2012 study used only a 1920×1080 monitor with a lower resolution central region than Varjo (about half the resolution) and blending regions that are so broad that they would be totally impractical for f-display.

Nvidia's 2016 study (which cites Microsoft 2012) simplified to two levels of detail, fovea and periphery, with sampling factors of 1 and 4 and a simpler linear blending between the two detail levels. Unfortunately, most of Nvidia's study was done with a very low angular resolution Oculus headset display with about 4.7 arcminutes/pixel and a little over 1,000 by 1,000 pixels per eye, the same display Varjo uses for the low resolution part of the image. Most of the graphs and discussion in the paper were with respect to this low angular resolution headset.

Nvidia 2016 also did some study of a 27″ (diagonal) 2560×1440 monitor with the user 81cm away, resulting in an angular resolution of about 1 arcminute and a horizontal FOV of 40 degrees, which would be more applicable to Varjo's case. Unfortunately, as the paper states about their user study, "We only evaluate the HMD setup, since the primary goal of our desktop study in Section 3.2 was to confirm our hypothesis for a higher density display." The only clue they give for the higher resolution system is that, "We set the central foveal radius for this setup to 7.5°." There was no discussion I could find of how they set the size of the blend region, so it is only a data point.

Comment/Request: I looked around for a study that would be more applicable to Varjo’s case. I was expecting to find a foveated rendering study using say a 4K (3840×2160) television which would support 1 arcminute for 64 by 36 degrees but I did not find it. If you know of such a study let me know.

Foveated Rending is Much Easier Than Foveated Display

Even if we had an f-rendering study of a ~1-arcminute peak resolution system, it would still only give us some insight into the f-display issues. F-rendering, while conceptually similar and likely to be required to support an f-display, is significantly simpler.

With f-rendering, everything is mathematical beyond the detection of the eye movement. The high resolution and lower resolution region(s) and the blend region(s) can be of arbitrary size to reduce detection and can even be dynamic based on content. The alignment between resolutions is perfectly registered. The color and contrast between resolutions are identical. The rendering of the high resolution area does not have to be scaled/re-sampled to match the background.

Things are much tougher for an f-display as there are two physically different displays and the high resolution display has to be optically aligned/moved based on the movement of the eye. The alignment of the display resolutions is limited by the optics' ability to move the apparent location of the high resolution part of the image. There is likely to be some vibration/movement even when aligned. The potential size of the high resolution display, as well as the size of the transition region, is limited by the size/cost of the microdisplay used. There can be only a single transition. The brightness, color, and contrast will be different between the two physically different displays (even if both are, say, OLED, the brightness and colors will not be exactly the same). Additionally, the high resolution display's image will have to be remapped after any optical distortion to match the context/peripheral image; this will both reduce the effective resolution and introduce movement into the highest resolvable (by the eye) part of the FOV as the foveated display tracks the eye on what otherwise should be a stationary image.

When asked, Varjo said that they have more capable systems in the lab than the fixed f-display prototype they are showing. But they stopped short of saying whether they have a full-up running system, and they have provided no results of any human studies.

The bottom line here is that there are many more potential issues with f-display that could prove to be very hard if not practically impossible to solve. A major problem is getting the high resolution image to optically move and stop without the eye noticing it. It is impossible to fully understand how well it will work without a full-blown working system and a study with humans and a wide variety of content and user conditions, including the user moving their head and the reaction of the display and optics.

Varjo’s Current Demo

Varjo is currently demoing a proof of concept system with the foveated/high-resolution image fixed and not tracking the center of vision. The diagram below shows the 100 by 100 degree FOV of the current Varjo demonstration system. For the moment at least, let's assume their next step will be to have a version of this where the center/foveated image moves.

Shown in the figure above is roughly the size of the foveated display region (green rectangle), which covers about 27.4 by 15.4 degrees. The dashed red rectangle shows the area covered by the pictures provided by Varjo, which does not even fully cover the foveated area (in the pictures they just show the start of the transition/blending from high to low resolution).

Also shown is a dashed blue circle with the 7.5 degree "central foveal radius" (15 degree diameter) of the Nvidia 2016 high angular resolution system. It is interesting that it is pretty close to the angle covered vertically by the Varjo display.
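
As a cross-check, the 27.4 by 15.4 degree region follows from Varjo's stated ~70 pixels/degree if one assumes a 1920×1080 microdisplay, which is my assumption and not something Varjo has confirmed:

```python
# Cross-check of the foveated region size, assuming a 1920x1080 microdisplay
# (my assumption) at Varjo's stated ~70 pixels/degree.
pixels_h, pixels_v = 1920, 1080
ppd = 70.0
print(f"~{pixels_h / ppd:.1f} x {pixels_v / ppd:.1f} degrees")   # ~27.4 x 15.4 degrees
```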

Will It Be Better Than A Non-Foveated Display (Assuming Very Good Eye Tracking)?

Varjo's foveated display should appear to the human eye as having much higher resolution than a non-foveated display with the same resolution as Varjo's context/periphery display. It is certainly going to work well when totally stationary (such as in Varjo's demo system).

My major concern (and something that can't be tested without a full-blown system) comes when everything moves. The evidence above suggests that there may be visible moving noise at the boundary between the foveated and context image.

Some of the factors that could affect the results:

  1. Size of the foveated/central image. Making this bigger would move the transition further out. This could be done optically or with a bigger device. Doing it optically could be expensive/difficult and using a larger device could be very expensive.
  2. The size of the transition/blur between the high and low resolution regions. It might be worth losing some of the higher resolution to create a smoother transition. From what I can tell, Varjo has a small transition/blend region compared to the f-rendering systems.
  3. The accuracy of the tracking and placement of the foveated image. In particular how accurately they can optically move the image. I wonder how well this will work in practice and will it have problems with head movement causing vibration.
  4. How fast they can move the foveated image and have it be totally still while displaying.

A Few Comments About Re-sampling of the Foveated Image

One should also note that the moving foveated image will by necessity have to be mapped onto the stationary low resolution image. Assuming the rendering pipeline first generates an image on a regular rectangular grid and then re-samples it to adjust for the placement and optical distortion of the foveated image, the net effective resolution will be about half that of the "native" display due to the re-sampling.
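
A toy example of why the re-sampling costs resolution: a one-pixel feature that lands half a pixel off the destination grid gets spread over two pixels at half the contrast.

```python
# Tiny illustration of the re-sampling loss: a one-pixel-wide bright line,
# bilinearly re-sampled at a half-pixel shift, smears across two pixels at half contrast.
line = [0.0, 0.0, 1.0, 0.0, 0.0]
shift = 0.5
resampled = [(1 - shift) * line[i] + shift * line[i - 1] for i in range(len(line))]
print(resampled)   # [0.0, 0.0, 0.5, 0.5, 0.0] -> the sharp line becomes two half-bright pixels
```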

In theory, this re-sampling loss could be avoided/reduced by computing the high resolution image with the foveated image already remapped, but with "conventional" pipelines this would add a lot of complexity. This type of display would likely in the long run be used in combination with foveated rendering, where this may not add too much more to the pipeline (just something to deal with the distortion).

Annotated Varjo Image

First, I want to compliment Varjo for putting actual through-the-optics high resolution images on their website (note, click on their "Full size JPG version"). By Varjo's own admission, these pictures were taken crudely with a consumer camera, so the image quality is worse than you would see looking into the optics directly. In particular, there are chromatic aberrations clearly visible in the full size image that are likely caused by the camera and how it was used and are not necessarily a problem with Varjo's optics. If you click on the image below, it will bring up the full size image (over 4,000 by 4,000 pixels and about 4.5 megabytes) in a new tab.

If you look at the green rectangle, it corresponds to the size of the foveated image in the green rectangle of the prior diagram showing the whole 100 by 100 degree FOV.

You should be able to clearly see the transition/blending starting at the top and bottom of the foveated image (see also right). The end of the blending is cut off in the picture.

The angles given in the figure were calculated based on the known pixel size of the Oculus CV1 display (its pixels are clearly visible in the non-foveated picture). For the "foveated display" (green rectangle) I used Varjo's statement that it is at least 70 pixels/degree (but I suspect not much more than that either).

Next Time On Foveated Displays (Part 3)

Next time on this topic, I plan on discussing how f-displays may or may not compete in the future with higher resolution single displays.

Near Eye Displays (NEDs): Gaps In Pixel Sizes

I get a lot of questions to the effect of "what is the best technology for a near eye display (NED)?" There really is no "best" as every technology has its strengths and weaknesses. I plan to write a few articles on this subject as it is way too big for a single article.

Update 2017-06-09: I added the Sony Z5 Premium 4K cell phone size LCD to the table. Its "pixel" is about 71% of the linear dimension of the Samsung S8's, or about half the area, but still much larger than any of the microdisplay pixels. One thing I should add is that most cell phone makers are "cheating" on what they call a pixel. The Sony Z5 Premium's "pixel" really only has 2/3rds of an R, G, and B per pixel it counts. It also has them in a strange 4 pixel zigzag that causes beat frequency artifacts when displaying full resolution 4K content (GSMARENA's close-up pictures of the Z5 Premium fail to show the full resolution in both directions). Similarly, Samsung goes with RGBG type patterns that only have 2/3rds of the full pixels in the way they count resolution. These "tricks" in counting are OK when viewed with the naked eye at beyond 300 "pixels" per inch, but become more problematical/dubious when used with optics to support VR.

Today I want to start with the issue of pixel size as shown in the table at the top (you may want to pop the table out into a separate window as you follow this article). To give some context, I have also included a few major direct view categories of displays. I have grouped the technologies into the colored bands in the table. I have given the pixel pitch (distance between pixel centers) as well as the pixel area (the square of the pixel pitch, assuming square pixels). Then, to give some context for comparison, I have compared the pitch and area relative to a 4.27-micron (µm) pixel pitch, which is about the smallest being made in large volume. There are also columns showing how big the pixel would appear in arcminutes when viewed from 25cm (250mm =~9.84 inches), which is the commonly accepted near focus point. Finally, there is a column showing how much the pixel would have to be magnified to equal 1 arcminute at 25cm, which gives some idea of the optics required.
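
The comparison columns are easy to reproduce; below is a sketch of the arithmetic using the 4.27µm LCOS pitch as the baseline (the ~45µm "S8-class" pitch is my approximation, not a published spec):

```python
import math

# Reproduces the comparison columns in the table above for a given pixel pitch.
BASELINE_UM = 4.27          # smallest high-volume LCOS pixel pitch, used as the reference
VIEW_MM = 250.0             # standard near-focus viewing distance (25 cm)

def table_row(pitch_um):
    rel_pitch = pitch_um / BASELINE_UM
    rel_area = rel_pitch ** 2
    arcmin_at_25cm = math.degrees(math.atan((pitch_um / 1000.0) / VIEW_MM)) * 60
    mag_to_1_arcmin = 1.0 / arcmin_at_25cm   # magnification needed to appear as 1 arcminute
    return rel_pitch, rel_area, arcmin_at_25cm, mag_to_1_arcmin

for name, pitch in [("LCOS 4.27um", 4.27), ("OLED microdisplay 9.3um", 9.3),
                    ("Direct-view ~45um (S8-class, approx.)", 45.0)]:
    p, a, am, m = table_row(pitch)
    print(f"{name}: {p:.1f}x pitch, {a:.0f}x area, {am:.2f} arcmin @25cm, "
          f"{m:.1f}x mag to reach 1 arcmin")
```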

In the table, I tried to use the smallest available pixel in a given technology that is being produced, with the exception of "micro-iLED," for which I could not get solid information (thus the "?"). In the case of LCOS, the smallest field sequential color (FSC) pixel I know of is the 4.27µm one used by my old company Syndiant in their new 1080p device. For the OLED, I used the eMagin 9.3µm pixel, and for the DLP, their 5.4 micron pico pixel. I used the LCOS/smallest pixel as the baseline to give some relative comparisons.

One thing that jumps out in the table is the fairly large gap in pixel sizes between the microdisplays and the other technologies. For example, you can fit over 100 4.27µm LCOS pixels in the area of a single Samsung S8 OLED pixel, or 170 LCOS pixels in the area of the pixel used in the Oculus CV1. Or, to be more extreme, you can fit over 5,500 LCOS pixels in one pixel of a 55-inch TV.

Big Gap In Near Eye Displays (NEDs)

The main point of comparison for today is the microdisplay pixels, which range from about 4.27µm to about 9.6µm in pitch, versus the direct view OLED and LCD displays at 40µm to 60µm that have been adapted with optics to be used in VR headsets (NEDs). Roughly we are looking at one order of magnitude in pixel pitch and two orders of magnitude in area. Perhaps the most direct comparison is the microdisplay OLED pixel at 9.3 microns versus the Samsung S8's, a 4.8X linear and 23X area difference.

So why is there this huge gap? It comes down to making the active matrix array circuitry to drive the technology. Microdisplays are made on semiconductor integrated circuits while direct view displays are made on glass and plastic substrates using comparatively huge and not very good transistors. The table below is based on one in an article from 2006 by Mingxia Gu while at Kent State University (it is a little out of date, but lists the various transistors used in display devices).

The difference in transistors largely explains the gap, with the microdisplays using transistors made in I.C. fabs whereas direct view displays fabricate their larger and less conductive transistors on top of glass or plastic substrates at much lower temperatures.

Microdisplays

Within the world of I.C.'s, microdisplays use very old/large transistors, often from nearly obsolete semiconductor processes. This is both an effort to keep the cost down and due to the fact that most display technologies need higher voltages than would be supported by smaller transistor sizes.

There are both display physics and optical diffraction reasons which limit making microdisplay pixels much smaller than 4µm. Additionally, as the pixel size gets below about 6 microns, the optical cost of enlarging the pixel to be seen by the human eye starts to escalate, so headset optics makers want 6+ micron pixels, which are much more expensive to make. To a first order, microdisplay costs in volume are a function of the area of the display, so smaller pixels mean less expensive devices for the same resolution.

The problem for microdisplays is that even using old I.C. fabs, the cost per square millimeter is extremely high compared to TFT on glass/plastic, and yields drop as the size of the device grows, so doubling the pixel pitch could result in an 8X or more increase in cost. While it sounds good to be using old/depreciated I.C. fabs, it may also mean they do not have the best/newest/highest yielding equipment, or worse yet, the facilities get closed down as obsolete.

The net result is that microdisplays are nowhere near cost competitive with "re-purposed" cell phone technology for VR if you don't care about size and weight. They are the only way to do small lightweight headsets and really the only way to do AR/see-through displays (save the huge Meta-2 bug-eye bubble).

I hope to pick up this subject more in some future articles (as each display type could be a long article in and of itself). But for now, I want to get onto the VR systems with larger flat panels.

Direct View Displays Adapted for VR

Direct view VR headsets (ex. Oculus, HTC Vive, and Google Cardboard) have leveraged direct view display technologies developed for cell phones. They then put simple optics in front of the display so that people can focus the image when the display is put so near the eye.

The accepted standard for human "near vision" is 25cm/250mm/9.84-inches. This is about as close as a person can focus and is used for comparing effective magnification. With simple (single/few lens) optics you are not so much making the image bigger per se, but rather moving the display closer to the eye and then using the optics to enable the eye to focus. A typical headset uses a roughly 40mm focal length lens and puts the display at the focal length or less (e.g. 40mm or less) from the lens. Putting the display at the focal length of the lens makes the image focus at infinity/far away.

Without getting into all the math (which can be found on the web), the result is that a 40mm focal length nets an angular magnification (relative to viewing at 25cm) of about 6X. So for example, looking back at the table at the top, the Oculus pixel (similar in size to the HTC Vive's), which would be about 0.77 arcminutes at 25cm, ends up appearing to cover about 4.7 arcminutes (which are VERY large/chunky pixels) and about a 95 degree FOV (this depends on how close the eye gets to the lens; for a great explanation of this subject and other optical issues with the Oculus CV1 and HTC Vive see this Doc-Ok.org article).
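
A rough version of that math, assuming the display sits at the lens's focal length and approximating the CV1-class pixel pitch from the 0.77 arcminute figure above:

```python
import math

# Rough angular magnification math for a simple VR lens, assuming the display sits
# at the focal length. The pixel pitch is approximated from the 0.77 arcmin figure.
NEAR_MM = 250.0                 # standard 25 cm near-viewing reference
focal_mm = 40.0                 # typical headset lens focal length (assumed)
pitch_mm = 0.056                # ~56 um pixel pitch (CV1-class, approximate)

magnification = NEAR_MM / focal_mm                                  # ~6.25x vs viewing at 25 cm
arcmin_at_25cm = math.degrees(math.atan(pitch_mm / NEAR_MM)) * 60   # ~0.77 arcmin
arcmin_in_headset = math.degrees(math.atan(pitch_mm / focal_mm)) * 60  # ~4.8 arcmin

print(f"~{magnification:.1f}x magnification; {arcmin_at_25cm:.2f} arcmin at 25cm "
      f"-> ~{arcmin_in_headset:.1f} arcmin/pixel in the headset")
```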

Improving VR Resolution  – Series of Roadblocks

For reference, 1 arcminute per pixel is considered near the limit of human vision and most "good resolution" devices try to be under 2 arcminutes per pixel, and preferably under 1.5. So let's say we want to keep the ~95 degree FOV but improve the angular resolution by 3X linearly to about 1.5 arcminutes; we have several (bad) options:

  1. Get someone to make a pixel that is 3X smaller linearly or 9X smaller in area. But nobody makes a pixel this size that can support about 3,000 pixels on a side. A microdisplay (I.C. based) will cost a fortune (like over $10,000/eye, if it could be made at all), and nobody makes transistors that are cheap and compatible with displays that are small enough. But let's for a second assume someone figures out a cost effective display; then you have the problem that you need optics that can support this resolution, and not the cheap low resolution optics with terrible chromatic aberrations, god rays, and astigmatism that you can get away with at 4.7 arcminute pixels.
  2. Use say the Samsung S8 pixel size (or a little smaller) and make two 3K by 3K displays (one for each eye). Each display will be about 134mm or about 5.26 inches on a side, and the width of the two displays plus the gap between them will end up at about 12 inches. So think in terms of strapping a large iPad Pro in front of your face, only now it has to be about 100mm (~4 inches) in front of the optics, or about 2.5X as far away as on current headsets (see the sizing sketch after this list). Hopefully you are starting to get the picture: this thing is going to be huge and unwieldy and you will probably need shoulder bracing in addition to head straps. Not to mention that the displays will cost a small fortune, along with the optics to go with them.
  3. Some combination of 1 and 2 above.
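
Below is a sketch of the sizing arithmetic behind option 2; the S8-class pixel pitch, the current panel width, and the 40mm focal length are all approximations from earlier in this article:

```python
# Sizing arithmetic behind option 2 above (S8-class pixel pitch and current
# panel width / focal length are approximations, not exact specs).
pixels_per_side = 3000
pitch_mm = 0.0446              # ~44.6 um, roughly a Samsung S8-class pixel
current_display_mm = 60.0      # approx. width of one eye's view on a CV1-class panel
current_focal_mm = 40.0

display_mm = pixels_per_side * pitch_mm                              # ~134 mm (~5.3 in) per eye
new_focal_mm = current_focal_mm * display_mm / current_display_mm    # to keep roughly the same FOV

print(f"~{display_mm:.0f} mm per eye; two displays plus a gap -> roughly 12 inches wide")
print(f"Lens focal length would need to grow to ~{new_focal_mm:.0f} mm to keep the FOV")
```
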
The Future Does Not Follow a Straight Path

I'm trying to outline above the top level issues (there are many more). Even if/when you solve the display cost/resolution problem, lurking behind that is a massive optical problem to sustain that resolution. These are the problems "straight line futurists" just don't get; they assume everything will just keep improving at the same rate it has in the past, not realizing they are starting to bump up against some very non-linear problems.

When I hear about "Moore's Law" being applied to displays I just roll my eyes and say that they obviously don't understand Moore's Law and the issues behind it (and why it kept slowing down over time). Back in November 2016, Oculus Chief Scientist Michael Abrash made some "bold predictions" that by 2021 we would have 4K (by 4K) per eye and a 140 degree FOV with 2 arcminutes per pixel. He upped my example above by 1.33X more pixels and upped the FOV by almost 1.5X, which introduces some serious optical challenges.

At times like this I like to point out the Super Sonic Transport or SST of the 1960's. The SST seemed inevitable for passenger travel; after all, in less than 50 years passenger aircraft went from nothing to the jet age. Yet today, over 50 years later, passenger aircraft still fly at about the same speed. Oh, by the way, in the 1960's they were predicting that we would be vacationing on the moon by now and having regular flights to Mars (heck, we made it to the moon in less than 10 years). We certainly could have 4K by 4K displays per eye and a 140 degree FOV by 2021 in a head mounted display (it could be done today if you don't care how big it is), but expect it to be more like the cost of flying supersonic and not a consumer product.

It is easy to play armchair futurist and assume "things will just happen because I want them to happen." The vastly harder part is to figure out how it can happen. I lived through I.C. development from the late 1970's through the mid 1990's, so I "get" learning curves and rates of progress.

One More Thing – Micro-iLED

I included in the table at the top Micro Inorganic LEDs, also known as just Micro-LEDs (I'm using iLED to make it clear these are not OLEDs). They are getting a lot of attention lately, particularly after Apple bought LuxVue and Oculus bought InfiniLED. These use very small "normal/conventional" LEDs that are mounted (essentially printed) on a substrate. The fundamental issue is that red requires a very different crystal from blue and green (and even those have different levels of impurities). So they have to make individual LEDs and then combine them (or maybe someday grow the dissimilar crystals on a common substrate).

The allure is that iLEDs have some optical properties that are superior to OLEDs. They have a tighter color spectrum, are more power efficient, can be driven much brighter, have fewer issues with burn-in, and in some cases produce less diffuse (better collimated) light.

These Micro-iLEDs are being used in two ways: to make very large displays by companies such as Sony, Samsung, and NanoLumens, or supposedly very small displays (LuxVue and InfiniLED). I understand how the big display approach works; there is lots of room for the LEDs, and these displays are very expensive per pixel.

With the small display approach, they seem to have the double issue of being able to cut very small LEDs and effectively "print" the LEDs on a TFT substrate similar to, say, OLEDs. What I don't understand is how these are supposed to be smaller than, say, OLEDs, which would seem to be at least as easy to make on similar TFT or similar transistor substrates. They don't seem to "fit" in near eye, but maybe there is something I am missing at this point in time.

Avegant “Light Field” Display – Magic Leap at 1/100th the Investment?

Surprised at CES 2017 – Avegant Focus Planes (“Light Field”)

While at CES 2017 I was invited to Avegant's suite and was expecting to see a new and improved and/or lower cost version of the Avegant Glyph. The Glyph was hardly revolutionary; it is a DLP based, non-see-through near eye display built into a set of headphones with reasonably good image quality. Based on what I was expecting, it seemed like a bit much to be signing an NDA just to see what they were doing next.

But what Avegant showed was essentially what Magic Leap (ML) has been claiming to do in terms of focus planes/"light fields" with vergence & accommodation. Avegant had accomplished this with likely less than 1/100th the amount of money ML is reported to have raised (ML has raised to date about $1.4 billion). In one stroke they made ML more believable and at the same time raised the question of why ML needed so much money.

What I saw – Technology Demonstrator

I was shown a headset with two HDMI cables for video and a USB cable for power and sensor data going to an external desktop computer, all bundled together. A big plus for me was that there was enough eye relief that I could wear my own glasses (I have severe astigmatism so just diopter adjustments don't work for me). The picture at left is of the same or a similar prototype to the one I wore. The headset was a bit bulkier than say Hololens, plus the bundle of cables coming out of it. Avegant made it clear that this was an engineering prototype and nowhere near a finished product.

The mixed reality/see-through headset merges the virtual world with the see-through real world. I was shown three (3) mixed reality (MR) demos, a moving Solar System complete with asteroids, a Fish Tank complete with fish swimming around objects in the room and a robot/avatar woman.

Avegant makes the point that the content was easily ported from Unity into their system, with the fish tank model coming from the Monterey Bay Aquarium and the woman and solar system being downloaded from the Unity community open source library. The 3-D images were locked to the "real world," taking this from simple AR into MR. The tracking was not perfect, nor did I care; the point of the demo was the focal planes, and lots of companies are working on tracking.

It is easy to believe that by "turning the crank" they can eliminate the bulky cables and that the tracking and locking between the virtual and real world will improve. It was a technology capability demonstrator and on that basis it has succeeded.

What Made It Special – Multiple Focal Planes / “Light Fields”

What ups the game from say Hololens and takes it into the realm of Magic Leap is that it supported simultaneous focal planes, what Avegant calls "Light Fields" (a bit different than true "light fields" as I see it). The user could change where they were focusing in the depth of the image and bring things that were close or far into focus. In other words, they simultaneously present multiple focus distances to the eye. You could also, by shifting your eyes, see behind objects a bit. This clearly is something optically well beyond Hololens, which does simple stereoscopic 3-D and in no way presents multiple focus points to the eye at the same time.

In short, what I was seeing in terms of vergence and accommodation was everything Magic Leap has been claiming to do. But Avegant has clearly spent only a very small fraction of the development cost, it was at least portable enough that they had it set up in a hotel room, and the optics look to be economical to make.

Now it was not perfect, nor was Avegant claiming it to be at this stage. I could see some artifacts, in particular lots of what looked like faint diagonal lines. I’m not sure if these were a result of the multiple focal planes or some other issue such as a bug.

Unfortunately, the only “through the lens” video currently available is at about 1:01 in Avegant’s “Introducing Avegant Light Field” Vimeo video. There are only a few seconds of it, and it really does not demonstrate the focusing effects well.

Why Show Me?

So why were they showing it to me, an engineer known to be skeptical of demos? They knew of my blog, and that is why I was invited to see the demo. Avegant was in some ways surprisingly open about what they were doing and answered most, but not all, of my technical questions. They appeared to be making an effort to make sure people understand it really works. It seems clear they wanted someone who would understand what they had done and could verify that it is something different.

What They Are Doing With the Display

While Avegant calls their technology “Light Fields,” it is implemented with (directly quoting them) “a number of fixed digital focal planes, and then interpolate the planes in-between them.” Multiple focus planes have many of the same characteristics as classical light fields, but require much less image data to be simultaneously presented to the eye, saving the power of generating and displaying image data that the eye will largely not “see”/use.
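As an illustration only (this is my guess at what interpolating between planes could mean, not Avegant’s disclosed method), content at an arbitrary depth could be split between the two nearest fixed focal planes with a simple weighting, for example:

def plane_weights(depth_diopters, plane_diopters):
    """Weight content at a given depth across the two nearest fixed focal planes."""
    planes = sorted(plane_diopters)
    if depth_diopters <= planes[0]:
        return {planes[0]: 1.0}                     # clamp to the nearest plane
    if depth_diopters >= planes[-1]:
        return {planes[-1]: 1.0}
    for near, far in zip(planes, planes[1:]):
        if near <= depth_diopters <= far:           # linear blend between the bracketing planes
            t = (depth_diopters - near) / (far - near)
            return {near: 1.0 - t, far: t}

# Example: three hypothetical planes at 0.5, 1.5 and 3.0 diopters (2 m, 0.67 m, 0.33 m)
print(plane_weights(1.0, [0.5, 1.5, 3.0]))          # -> {0.5: 0.5, 1.5: 0.5}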

They are currently using a 720p DLP per eye for the display engine but they said they thought they could support other display technologies in the future. As per my discussion on Magic Leap from November 2016, DLP has a high enough field rate that they could support displaying multiple images with the focus changing between images if you can change the focus fast enough. If you are willing to play with (reduce) color depth, DLP could support a number of focus planes. Avegant would not confirm if they use time sequential focus planes, but I think it likely.
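To see how color depth trades against time-sequential focus planes, here is a back-of-the-envelope sketch; the bit-plane budget is a made-up illustrative number, not a DLP specification or anything Avegant confirmed:

def bits_per_color(bitplanes_per_frame, colors=3, focal_planes=1):
    """Bit-planes left per color, per focal plane, in one 1/60 s frame."""
    return bitplanes_per_frame // (colors * focal_planes)

# With a hypothetical budget of 96 binary bit-planes per 1/60 s frame:
for planes in (1, 2, 4, 6):
    print(planes, "focal plane(s) ->", bits_per_color(96, focal_planes=planes), "bit-planes per color")
# 1 -> 32, 2 -> 16, 4 -> 8, 6 -> 5: more focal planes means fewer bits of color depth each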

They are using “birdbath optics” per my prior article, with a beam splitter and a spherical semi-mirror/combiner (see picture at left). With a DLP illuminated by LEDs, they can afford the higher light losses of the birdbath design and still support a reasonable amount of transparency to the real world. Note that waveguides also tend to lose/waste a large amount of light. Avegant said that the current system was 50% transparent to the real world but that they could make it more so (by wasting more light).

Very importantly, a birdbath optical design can be very cheap (on the order of only a few dollars) whereas waveguides can cost many tens of dollars (reportedly Hololens’ waveguides cost over $100 each). The birdbath optics also can support a very wide field of view (FOV), something generally very difficult/expensive to support with waveguides. The optical quality of a birdbath is generally much better than the best waveguides. The downside of the birdbath compared to waveguides is that it is bulkier and does not look as much like ordinary glasses.

What they would not say – Exactly How It Works

The one key thing they would not say is how they are supporting the change in focus between focal planes. The obvious way to do it would be with some kind of electromechanical device such as a moving focus element or a liquid-filled lens (the obvious suspects). In a recent interview, they repeatedly said that there were no moving parts and that it was “economical to make.”

What They are NOT Doing (exactly) – Mechanical Focus and Eye/Pupil Tracking

After meeting with Avegant at CES I decided to check out their recent patent activity and found US 2016/0295202 (‘202). It shows a birdbath optics system (but with a non-see-through curved mirror). This configuration with a semi-mirror curved element would seem to do what I saw. In fact, it is very similar to what Magic Leap showed in their US application 2015/0346495.

Avegant’s ‘202 application uses a “tuning assembly 700” (some form of electromechanical focus).

It also uses eye tracking 500 to know where the pupil is aimed. Knowing where the pupil is aimed would, at least in theory, allow them to generate an in-focus plane for where the eye is looking and an out-of-focus plane for everything else. That is how it would work in theory, but it might be problematical (no fear, this is not what they are doing, remember).

I specifically asked Avegant about the ‘202 application and they said categorically that they were not using it and that the applications related to what they are using have not yet been published (I suspect they will be published soon, perhaps part of the reason they are announcing now). They categorically stated that there were “no moving parts” and that they “did not eye track” for the focal planes. They stated that the focusing effect would even work with, say, a camera (rather than an eye) and was in no way dependent on pupil tracking.

A lesson here is that even small companies file patents on concepts that they don’t use. But still, this application gives insight into what Avegant was interested in doing and some clues as to how they might be doing it. Eliminate the eye tracking and substitute a non-mechanical focus mechanism that is rapid enough to support 3 to 6 focus planes and it might be close to what they are doing (my guess).

A Caution About “Demoware”

A big word of warning here about demoware. When seeing a demo, remember that you are being shown what makes the product look best and examples that might make it look not so good are not shown.

I was shown three short demos that they picked; I had no choice and could not pick my own test cases. I also don’t know exactly the mechanism by which it works, which makes it hard to predict the failure modes, as in what type of content might cause artifacts. For example, everything I was shown was very slow moving. If they are using sequential focus planes, I would expect to see problems/artifacts with fast motion.

Avegant’s Plan for Further Development

Avegant is in the process of migrating away from requiring a big PC and onto mobile platforms such as smartphones. Part of this is continuing to address the computing requirement.

Clearly they are going to continue refining the mechanical design of the headset and will either get rid of or slim down the cables and have them go to a mobile computer. They say that all the components are easily manufacturable, and this I would tend to believe. I do wonder how much image data they have to send, but it appears they are able to do it with just two HDMI cables (one per eye). It would seem they will be wire-tethered to a (mobile) computing system. I’m more concerned about how the image quality might degrade with, say, fast moving content.

They say they are going to be looking at other (than the birdbath) combiner technology; one would assume a waveguide of some sort to make the optics thinner and lighter. But going to waveguides could hurt image quality, raise cost, and may further limit the FOV.

Avegant is leveraging the openness of Unity to support getting a lot of content generation for their platform. They plan on a Unity SDK to support this migration.

They said they will be looking into alternatives for the DLP display; I would expect LCOS and OLED to be considered. They said that they had also thought about laser beam scanning but their engineers objected to trying it for eye safety reasons; engineers are usually the first Guinea pigs for their own designs and a bug could be catastrophic. If they are using time-sequential focal planes, which is likely, then other technologies such as OLED, LCOS, or laser beam scanning cannot generate sequential planes fast enough to support more than a few (1 to 3) focal planes per 1/60th of a second on a single device at maximum resolution.

How Important is Vergence/Accomodation (V/A)?

The simple answer is that it appears that Magic Leap raised $1.4B by demoing it. But as they say, “all that glitters is not gold.” The V/A conflict issue is real, but it mostly affects content that virtually appears “close”, say inside about 2 meters/6 feet.
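A quick way to see why roughly 2 meters is the usual cutoff: the accommodation error, measured in diopters (1/distance in meters), grows rapidly inside that range. A minimal sketch, assuming a single focal plane fixed at optical infinity:

def accommodation_error_diopters(content_distance_m, image_plane_diopters=0.0):
    """Mismatch between where the content appears and where the display is focused."""
    return abs(1.0 / content_distance_m - image_plane_diopters)

for d in (4.0, 2.0, 1.0, 0.5):
    print(d, "m ->", accommodation_error_diopters(d), "diopters of mismatch")
# 4 m -> 0.25, 2 m -> 0.5, 1 m -> 1.0, 0.5 m -> 2.0: the error doubles with each halving of distance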

It’s not clear that there aren’t simpler, less expensive, and/or lower power ways to deal with V/A conflict for “everyday use,” such as pupil tracking. Maybe (I don’t know) it would be enough to simply change the focus point when the user is doing close-up work rather than have multiple focal planes presented to the eye simultaneously.

The business question is whether solving V/A alone will make AR/MR take off. I think the answer is clearly no; this is not the last puzzle piece to be solved before AR/MR takes off. It is one of a large number of issues yet to be solved. Additionally, while Avegant says they have solved it economically, what is economical is relative. It still has added weight, power, processing, and costs associated with it, and it has negative impacts on the image quality; the classic “squeezing the balloon” problem.

Even if V/A added nothing and cost nothing extra, there are still many other human factor issues that severely limit the size of the market. At times like this, I like to remind people of the Artificial Intelligence boom in the 1980s (over 30 years ago) that it seemed all the big and many small companies were chasing as the next era of computing. There were lots of “breakthroughs” back then too, but the problem was bigger than all the smart people and money could solve.

BTW, if you want to know more about V/A and related issues, I highly recommend reading papers and watching videos by Gordon Wetzstein of Stanford. Particularly note his work on “compressive light field displays,” which he started while at MIT. He does an excellent job of taking complex issues and making them understandable.

Generally Skeptical About The Near Term Market for AR/MR

I’m skeptical that, with or without Avegant’s technology, the Mixed Reality (MR) market is really set to take off for at least 5 years (and likely more). I’ve participated in a lot of revolutionary markets (early video game chips, home/personal computers, graphics accelerators, Synchronous DRAMs, as well as various display devices) and I’m not a Luddite/flat-earther; I simply understand the challenges still left unsolved, and there are many major ones.

Most of the market forecasts for huge volumes in the next 5 years are written by people that don’t have a clue as to what is required; they are more science fiction writers than technologists. You can already see companies like Microsoft with Hololens, and before them Google with Google Glass, retrenching/regrouping.

Where Does Avegant Go Business Wise With this Technology?

Avegant is not a big company. They were founded in 2012. My sources tell me that they have raised about $25M, and I have heard that they have only sold about $5M to $10M worth of their first product, the Avegant Glyph. I don’t see the Glyph ever being a high volume product with a lot of profit to support R&D.

A related aside: I have yet to see a Glyph “in the wild” being used, say, on an airplane (where it would make the most sense). Even though the Glyph and other headsets exist, people given a choice still, by vast percentages, prefer larger smartphones and tablets for watching media on the go. The Glyph sells for about $500 now and is very bulky to store, whereas a tablet easily slips into a backpack or other bag and the display is “free”/built in.

But then, here you have this perhaps “key technology” that works and is doing something that Magic Leap has raised over $1.4 billion to try and do. It is possible (having not thoroughly tested either one) that Avegant’s is better than ML’s. Avegant’s technology is likely much more cost effective to make than ML’s, particularly if ML’s depends on using their complex waveguide.

Having not seen the details of either Avegant’s or ML’s method, I can’t say which is “best,” both image-wise and in terms of cost, nor whether, from a patent perspective, Avegant’s is different from ML’s.

So Avegant could try and raise money to do it on their own, but they would have to raise a huge amount to last until the market matures and compete with much bigger companies working in the area. At best they have solved one (of many) interesting puzzle pieces.

It seems obvious (at least to me) that the more likely good outcome for them would be as a takeover target by someone that has the deep pockets to invest in mixed reality for the long haul.

But this should certainly make the Magic Leap folks and their investors take notice. With less fanfare, and a heck of a lot less money, Avegant has a solution to the vergence/accommodation problem that ML has made such a big deal about.

Near-Eye Bird Bath Optics Pros and Cons – And IMMY’s Different Approach

Why Birdbaths Optics? Because the Alternative (Waveguides) Must Be Worse (and a teaser)

The idea for this article started when I was looking at the ODG R-9 optical design with OLED microdisplays. They combined an OLED microdisplay that is not very bright in terms of nits with a well known “birdbath” optical design that has very poor light throughput. It seems like a horrible combination. I’m fond of saying, “when intelligent people choose a horrible design, the alternative must have seemed worse.”

I’m going to “beat up,” so to speak, the birdbath design by showing how some fundamental light throughput numbers multiply out and why the ODG R-9 I measured at CES blocks so much of the real world light. The R-9 also has a serious issue with reflections. This is the same design that a number of publications considered among the “best innovations” of CES; it seems to me that they must have only looked at the display superficially.

Flat waveguides such as those used by Hololens, Vuzix, Wave Optics, and Lumus, as well as expected from Magic Leap, get most of the attention, but I see a much larger number of designs using what is known as a “birdbath” and similar optical designs. Waveguides are no secret these days, and the fact that so many designs still use birdbath optics tells you a lot about the issues with waveguides. Toward the end of this article, I’m going to talk a little about the IMMY design that replaces part of the birdbath design.

As a teaser, this article is to help prepare for an article on an interesting new headset I will be writing about next week.

Birdbath Optics (So Common It Has a Name)

The birdbath combines two main optical components, a spherical mirror/combiner (part-mirror) and a beam splitter. The name “birdbath” comes from the spherical mirror/combiner looking like a typical birdbath. It is used because it generally is comparatively inexpensive to downright cheap while also being relatively small/compact and having good overall image quality. The design fundamentally supports a very wide FOV, which is at best difficult to support with waveguides. The big downsides are light throughput and reflections.

A few words about Nits (Cd/m²) and Micro-OLEDs

I don’t have time here to get into a detailed explanation of nits (Cd/m²). Nits is the measure of light at a given angle whereas lumens is the total light output. The simplest analogy is to a water hose with a nozzle (apropos here since we are talking about birdbaths). Consider two spray patterns, one with a tight jet of water and one with a wide fan pattern, both outputting exactly the same total amount of water per minute (lumens in this analogy). The one with the tight pattern would have high water pressure (nits in this analogy) over a narrow angle whereas the fan spray would have lower water pressure (nits) over a wider angle.

Additionally, it would be relatively easy to put something in the way of the tight jet and turn it into a fan spray, but there is no way to turn the fan spray into a jet. This applies to light as well: it is much easier to go from high nits over a narrow angle to lower nits over a wide angle (say with a diffuser), but you can’t go the other way easily.

Light from an OLED is like the fan spray, only it covers a 180 degree hemisphere. This can be good for a large flat panel where you want a wide viewing angle, but it is a problem for a near eye display where you want to funnel all the light into the eye, because so much of the light will miss the pupil of the eye and is wasted. With an LED you have a relatively small point of light that can be funneled/collimated into a tight “jet” of light to illuminate an LCOS or DLP microdisplay.

The combination of the light output from LEDs and the ability to collimate the light means you can easily get tens of thousands of nits with an LED-illuminated LCOS or DLP microdisplay, whereas OLED microdisplays typically only have 200 to 300 nits. This is a major reason why most see-through near eye displays use LCOS and DLP over OLEDs.

Basic Non-Polarizing Birdbath (example, ODG R-9)

The birdbath has two main optical components, a flat beam splitter and a spherical mirror. In the case of see-through designs, the spherical mirror is a partial mirror so the spherical element acts as a combiner. The figure below is taken from an Osterhout Design Group (ODG) patent and shows a simple birdbath using an OLED microdisplay such as in their ODG R-9. Depending on various design requirements, the curvature of the mirror, and the distances, the lenses 16920 in the figure may not be necessary.

The light from the display device (an OLED microdisplay in the case of the ODG R-9) is first reflected by the beam splitter away from the eye and onto the curved combiner perpendicularly (on-axis), so that a simple spherical combiner will uniformly magnify and move the apparent focus point of the image (if not “on axis,” the image will be distorted and the magnification will vary across the image). The curved combiner (partial mirror) has minimal optical distortion on light passing through it.

Light Losses (Multiplication is a Killer)

A big downside to the birdbath design is the loss of light. The image light must make two passes at the beam splitter, one reflective and one transmissive, with reflective (Br) and transmissive (Bt) percentages of light. The light making it through both passes is Br x Bt. A 50/50 beam splitter might be about 48% reflective and 48% transmissive (with say a 4% combined loss), and the light throughput (Br x Bt) in this example is only 48% x 48% = ~23%. And a “50/50” ratio is the best case; if we assume a nominally 80/20 beam splitter (with still 4% total loss) we get 78% x 18% = ~14% of the light making it through the two passes.

Next we have the light loss of the spherical combiner. This is a trade-off of image light being reflected (Cr) versus real-world light being transmitted (Ct), where Cr + Ct is less than 1 due to losses. Generally you want the Cr to be low so the Ct can be high so you can see out (otherwise it is not much of a see-through display).

So let’s say the combiner has Cr=11% and Ct=75%, used with the 50/50 beamsplitter (with its ~4% loss) from above. The net light throughput assuming a “50/50” beam splitter and a 75% transmissive combiner is Br x Cr x Bt = ~2.5% !!! These multiplicative losses lose all but a small percentage of the display’s light. And consider that the “real world” net light throughput is Ct x Bt, which would be 75% x 48% = 36%, which is not great and would be too dark for indoor use.

Now let’s say you want the glasses to be at least 80% transmissive so they would be considered usable indoors. You might have the combiner Ct=90%, making Cr=6% (with 4% loss), and then Bt=90%, making Br=6%. This gives a real-world transmission of about 90% x 90% = 81%. But then you go back and realize the display light equation (Br x Cr x Bt) becomes 6% x 6% x 90% = ~0.3%. Yes, only about 3/1000ths of the starting image light makes it through.
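To make the multiplication explicit, here is a small sketch (in Python) that just multiplies out the example percentages above; the numbers are this article’s illustrative values, not measurements of any particular product.

def birdbath(Br, Bt, Cr, Ct):
    display_to_eye = Br * Cr * Bt   # image light: reflect off splitter, off combiner, back through splitter
    world_to_eye = Ct * Bt          # real-world light: through combiner, then through splitter
    return display_to_eye, world_to_eye

# "50/50" splitter (48%/48%) with an 11%-reflective, 75%-transmissive combiner:
print(birdbath(0.48, 0.48, 0.11, 0.75))   # ~(0.025, 0.36): ~2.5% of display light, ~36% see-through

# Pushing see-through to ~81% (90%/6% splitter and combiner):
print(birdbath(0.06, 0.90, 0.06, 0.90))   # ~(0.003, 0.81): only ~0.3% of the display light survives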

Why the ODG R-9 Is Only About 4% to 5% “See-Through”

Ok, now back to the specific case of the ODG R-9. The ODG R-9 has an OLED microdisplay that most likely has about 250 nits (200 to 250 nits is commonly available today), and they need to get about 50 nits (roughly) to the eye from the display to have decent image brightness indoors in a dark room (or one where most of the real world light is blocked). This means they need a total throughput of 50/250 = 20%. The best you can do with two passes through a beam splitter (see above) is about 23%. This forces the spherical combiner to be highly reflective with little transmission. You need something that reflects 20/23 = ~87% of the light and is only about 9% transmissive. The real world light then making it through to the eye is about 9% x 48% (Ct x Bt), or about 4.3%.

There are some other effects, such as the amount of total magnification, and I don’t know exactly what their OLED display is outputting nor the exact nits at the eyepiece, but I believe my numbers are in the ballpark. My camera estimates for the ODG R-9 came in at between 4% and 5%. When you are blocking about 95% of the real world light, are you really much of a “see-through” display?
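Here is the same arithmetic run “backwards” for the R-9-style case: given my assumed 250-nit OLED and a roughly 50-nit target at the eye, solve for how reflective the combiner has to be and see what is left of the real world. The 4% combiner loss is my assumption.

oled_nits = 250.0                      # assumed OLED microdisplay brightness
target_nits = 50.0                     # rough target at the eye for indoor viewing
Br, Bt = 0.48, 0.48                    # "50/50" beam splitter with ~4% combined loss

needed_throughput = target_nits / oled_nits    # 0.20
Cr = needed_throughput / (Br * Bt)             # ~0.87: combiner must reflect ~87% of the image light
Ct = 1.0 - Cr - 0.04                           # assume ~4% combiner loss -> only ~9% transmissive
print(round(Cr, 2), round(Ct, 2), round(Ct * Bt, 3))   # ~0.87, ~0.09, ~0.044 (~4% to 5% see-through)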

Note, all this is BEFORE you consider adding, say, optical shutters or something like Varilux® light blocking. Normally the birdbath design is used with non-see-through designs (where you don’t have the see-through losses) or, for see-through designs, with DLP® or LCOS devices illuminated with much higher nits (can be in the tens of thousands) so they can afford the high losses of light.

Seeing Double

There are also issues with getting a double image off of each face of the plate beam splitter, along with other reflections. Depending on the quality of each face, a percentage of light is going to reflect or pass through that you don’t want. This light will be slightly displaced based on the thickness of the beamsplitter. And because the light makes two passes, there are two opportunities to cause double images. Any light that is reasonably “in focus” is going to show up as a ghost/double image (for good or evil, your eye has a wide dynamic range and can see even faint ghost images). Below is a picture I took with my iPhone camera of a white and clear menu through the ODG R-9. I counted at least 4 ghost images (see colored arrows).

As a sort of reference, you can see the double image effect of the beamsplitter going in the opposite direction to the image light with my badge and the word “Media” and its ghost (in the red oval).

Alternative Birdbath Using Polarized Light (Google Glass)

Google Glass used a different variation of the birdbath design. They were willing to accept a much smaller field of view and thus could reasonably embed the optics in glass. It is interesting here to compare and contrast this design with the ODG one above.

First, they started with an LCOS microdisplay illuminated by LEDs, which can provide very much brighter and more collimated light, resulting in much higher (can be orders of magnitude) starting nits than an OLED microdisplay can output. The LED light is passed through a polarizing beam splitter that will pass about 45% P light to the LCOS device (245). Note a polarizing beam splitter passes one polarization and reflects the other, unlike the partially reflecting beam splitter in the ODG design above. The LCOS panel rotates the light that is to be seen to S polarization so that the beam splitter will reflect about 98% (with say 2% loss) of the S light.

The light then goes to a second polarizing beam splitter that also acts as the “combiner” that the user sees the real world through. This beam splitter is set up to pass about 90% of the S light and reflect about 98% of the P light (they are usually much better/more-efficient in reflection). You should notice that they have a λ/4 (quarter wave) film between the beam splitter and the spherical mirror which will rotate the light’s polarization 90 degrees (turning it from S to P) after it passes through it twice. This λ/4 “trick” is commonly used with polarized light. And since you don’t have to look through the mirror, it can be say 98% reflective with say another 3% loss for the λ/4.

With this design, about 45% (one pass through the beamsplitter) of the real world light makes it through, but only light polarized the “right way” makes it through, which makes looking at, say, LCD monitors problematical. By using the quarter wave film the design is pretty efficient AFTER you lose about 55% of the LED light in polarizing it initially. There are also fewer reflection issues because all the films and optics are embedded in glass, so you don’t get the air-to-glass index mismatches off the two surfaces of a relatively thick plate that cause unwanted reflections/double images.
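Using the rough percentages above (they are illustrative numbers, not Google’s specifications), the polarized path multiplies out roughly as follows:

led_to_lcos    = 0.45   # unpolarized LED light after polarization and the first PBS
splitter_s     = 0.98   # S-polarized image light reflected by the splitter
pbs2_pass_s    = 0.90   # second PBS passes the S image light toward the mirror
mirror         = 0.98   # fully reflective spherical mirror (no need to see through it)
quarter_wave   = 0.97   # ~3% loss for the double pass through the lambda/4 film
pbs2_reflect_p = 0.98   # returning P light reflected into the eye

display_to_eye = led_to_lcos * splitter_s * pbs2_pass_s * mirror * quarter_wave * pbs2_reflect_p
world_to_eye   = 0.45   # one polarized pass; hard to get much above ~45%
print(round(display_to_eye, 2), world_to_eye)   # ~0.37 of the LED light, ~45% see-through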

The Google Glass design has a lot of downsides too. There is nothing you can do to get the light throughput of the real world much above 45%, and there are always the problems of looking through a polarizer. But the biggest downside is that it cannot be scaled up for larger fields of view and/or more eye relief. As you scale this design up, the block of glass becomes large, heavy, and expensive, as well as being very intrusive/distorting to look through.

Without getting too sidetracked, Lumus in effect takes the one thick beam splitter and piece-wise cuts it into multiple smaller beam splitters to make the glass thinner. But this also means you can’t use the spherical mirror of a birdbath design with it, so you require optics before the beam splitting, and the light losses of the piece-wise beam splitting are much larger than a single beamsplitter.

Larger Designs

An alternative design would mix the polarizing beamsplitters of the Google Glass design above with the configuration of the ODG design above. And this has been done many times through the years with LCOS panels that use polarized light (an example can be found in this 2003 paper). The spherical mirror/combiner will be a partial non-polarizing mirror so you can see through it, and a quarter waveplate is used between the spherical combiner and the polarizing beam splitter. You are then stuck with about 45% of the real world light times the light throughput of the spherical combiner.

A DLP with a “birdbath” would typically use the non-polarizing beam splitter with a design similar to the ODG R-9, but replacing the OLED microdisplay with a DLP and its illumination. As an example, Magic Leap did this with a DLP but added a variable focus lens to support focus planes.

BTW, by the time you polarized the light from an OLED or DLP microdisplay, there would not be much, if any, efficiency advantage in using polarizing beamsplitters. Additionally, the light from the OLED is so diffused (varied in angles) that it would likely not behave well going through the beam splitters.

IMMY – Eliminating the Beamsplitter

The biggest light efficiency killer in the birdbath design is the combined reflective/transmissive passes via the beamsplitter. IMMY effectively replaces the beamsplitter of the birdbath design with two small curved mirrors that correct for the image being reflected off-axis from the larger curved combiner. I have not yet seen how well this design works in practice, but at least the numbers would appear to work better. One can expect only a few percentage points of light being lost off each of the two small mirrors, so that maybe 95% of the light from the OLED display makes it to the large combiner. Then you have the combiner reflection percentage (Cr) multiplying by about 95% rather than the roughly 23% of the birdbath beam splitter.

The real world light also benefits as it only has to go through a single combiner transmissive loss (Ct) and no beamsplitter (Bt) losses. Taking the ODG R-9 example above and assuming we started with a 250 nit OLED and need 50 nits to the eye, we could get there with about a 75% transmissive combiner. The numbers are at least starting to get into the ballpark where improvements in OLED microdisplays could fit, at least for indoor use (outdoor designs without sunshading/shutters need on the order of 3,000 to 4,000 nits).
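Re-running the R-9-style numbers with the beam splitter replaced by two small mirrors (a sketch with my assumed losses, not IMMY’s data) shows why the arithmetic looks so much friendlier:

oled_nits = 250.0
two_mirrors = 0.95          # assume a few percent total loss off the two small mirrors
Cr, Ct = 0.21, 0.75         # ~75% transmissive combiner, ~21% reflective (assuming ~4% loss)

print(round(oled_nits * two_mirrors * Cr))   # ~50 nits of display light reach the eye
print(Ct)                                    # 0.75: real-world light only passes the combiner once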

It should be noted that IMMY says they also have a “Variable transmission outer lens with segmented addressability” to support outdoor use and variable occlusion. Once again, this is their claim; I have not yet tried it out in practice, so I don’t know the issues/limitations. My use of IMMY here is to contrast it with the classical birdbath designs above.

A possible downside to the IMMY multi-mirror design is bulk/size, as seen below. Also notice the two adjustment wheels for each eye. One is for interpupillary distance, to make sure the optics line up centered with the pupils, which varies from person to person. The other knob is a diopter (focus) adjustment, which also suggests you can’t wear these over your normal glasses.

As I have said, I have not seen IMMY’s design in person to see how well it works and what faults it might have (nothing is perfect), so this is in no way an endorsement of their design. The design is so straightforward and a seemingly obvious solution to the beam splitter loss problem that it makes me wonder why nobody has used it earlier; usually in these cases, there is a big flaw that is not so obvious.

See-Though AR Is Tough Particularly for OLED

As one person told me at CES, “Making a near eye display see-through generally more than doubles the cost,” to which I would add, “it also has serious adverse effects on the image quality.”

The birdbath design wastes a lot of light, as does every other see-through design. Waveguide designs can be equally or more light wasteful than the birdbath. At least on paper, the IMMY design would appear to waste less than most others. But to make a device say 90% see-through, at best you start by throwing away over 90% of the image light/nits generated, and often more than 95%.

The most common solution today is to start with an LED-illuminated LCOS or DLP microdisplay so you have a lot of nits to throw at the problem and just accept the light waste. OLEDs are still orders of magnitude away in brightness/nits from being able to compete with LCOS and DLP by brute force.

 

AR/MR Optics for Combining Light for a See-Through Display (Part 1)

In general, people find the combining of an image with the real world somewhat magical; we see this with heads up displays (HUDs) as well as Augmented/Mixed Reality (AR/MR) headsets. Unlike Star Wars’ R2-D2 projection into thin air, which was pure movie magic (i.e. fake/impossible), light rays need something to bounce off of to redirect them into a person’s eye from the image source. We call this optical device that combines the computer image with the real world a “combiner.”

In effect, a combiner works like a partial mirror. It reflects or redirects the display light to the eye while letting light through from the real world. This is not, repeat not, a hologram, which it is being mistakenly called by several companies today. Over 99% of what people think of or call “holograms” today are not, but rather simple optical combining (also known as the Pepper’s Ghost effect).

I’m only going to cover a few of the more popular/newer/more-interesting combiner examples. For a more complete and more technical survey, I would highly recommend a presentation by Kessler Optics. My goal here is not to make anyone an optics expert but rather to give insight into what companies are doing and why.

With headsets, the display device(s) is too near for the human eye to focus, and there are other issues such as making a big enough “pupil/eyebox” so that the alignment of the display to the eye is not overly critical. With one exception (the Meta 2), there are separate optics that move the apparent focus point out (usually they try to put it in a person’s “far” vision as this is more comfortable when mixing with the real world). In the case of Magic Leap, they appear to be taking the focus issue to a new level with “light fields,” which I plan to discuss in the next article.

With combiners there is both the effect you want, i.e. redirecting the computer image into the person’s eye, and the potentially undesirable effects the combiner will cause in seeing through it to the real world. A partial list of the issues includes:

  1. Dimming
  2. Distortion
  3. Double/ghost images
  4. Diffraction effects of color separation and blurring
  5. Seeing the edge of the combiner

In addition to the optical issues, the combiner adds weight, cost, and size. Then there are aesthetic issues, particularly how they make the user’s eyes look and whether they affect how others see the user’s eyes; humans are very sensitive to how other people’s eyes look (see the EPSON BT-300 below as an example).

FOV and Combiner Size

There is a lot of desire to support a wide Field Of View (FOV), and for combiners a wide FOV means the combiner has to be big. The wider the FOV and the farther the combiner is from the eye, the bigger the combiner has to get (there is no way around this fact; it is a matter of physics). One way companies “cheat” is to not support a person wearing their glasses at all (like Google Glass did).

The simple (not taking everything into effect) equation (in Excel) to compute the minimum width of a combiner is =2*TAN(RADIANS(A1/2))*B1, where A1 is the FOV in degrees and B1 is the distance to the farthest part of the combiner. Glasses are typically about 0.6 to 0.8 inches from the eye, and with the size of the glasses and the frames you want about 1.2 inches or more of eye relief. For a 40 degree wide FOV at 1.2 inches this translates to 0.9″, at 60 degrees 1.4″, and for 100 degrees it is 2.9″, which starts becoming impractical (typical lenses on glasses are about 2″ wide).

For very wide FOV displays (over 100 degrees), the combiner has to be so near your eye that supporting glasses becomes impossible. The formula above will let you try your own assumptions.
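For convenience, here is the same minimum-width estimate in Python rather than Excel; it reproduces the 0.9″/1.4″/2.9″ numbers above.

import math

def min_combiner_width(fov_degrees, distance_inches):
    """2 * tan(FOV/2) * distance; ignores eyebox, combiner tilt, etc."""
    return 2.0 * math.tan(math.radians(fov_degrees / 2.0)) * distance_inches

for fov in (40, 60, 100):
    print(fov, "deg ->", round(min_combiner_width(fov, 1.2), 1), "inches")
# 40 deg -> 0.9, 60 deg -> 1.4, 100 deg -> 2.9 (at 1.2 inches of distance)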

Popular/Recent Combiner Types (Part 1)

Below, I am going to go through the most common beam combiner options. I’m going to start with the simpler/older combiner technologies and work my way to the “waveguide” beam splitters of some of the newest designs in Part 2. I’m going to try and hit on the main types, but there are many big and small variations within a type.

Solid Beam Splitter (Google Glass and Epson BT-300)

These are often used with a polarizing beam splitter when using LCOS microdisplays, but they can also be simple mirrors. They generally are small due to weight and cost issues, such as with the Google Glass at left. Due to their small size, the user will see the blurry edges of the beam splitter in their field of view, which is considered highly undesirable. Also, as seen in the Epson BT-300 picture (at right), they can make a person’s eyes look strange. As seen with both the Google Glass and Epson, they have been used with the projector engine(s) on the sides.

Google Glass has only about a 13 degree FOV (and did not support using a person’s glasses) and about 1.21 arc-minutes/pixel angular resolution, which is on the small end compared to most other headset displays. The BT-300 has about a 23 degree horizontal FOV (and has enough eye relief to support most glasses) with dual 1280×720 pixel displays per eye, giving it a 1.1 arc-minutes/pixel angular resolution. Clearly these are on the low end of what people are expecting in terms of FOV, and the solid beam splitter quickly becomes too large, heavy, and expensive as the FOV grows. Interestingly, they both are on the small end of apparent pixel size.

Spherical/Semi-Spherical Large Combiner (Meta 2)

While most of the AR/MR companies today are trying to make flatter combiners to support a wide FOV with small microdisplays for each eye, Meta has gone in the opposite direction with dual very large semi-spherical combiners with a single OLED flat panel to support an “almost 90 degree FOV”. Note in the picture of the Meta 2 device that there are essentially two hemispheres integrated together with a single large OLED flat panel above.

Meta 2 uses a 2560 by 1440 pixel display that is split between the two eyes. Allowing for some overlap, there will be about 1200 pixels per eye to cover the 90 degree FOV, resulting in rather chunky/large (similar to Oculus Rift) 4.5 arc-minutes/pixel, which I find somewhat poor (a high resolution display would be closer to 1 a-m/pixel).
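The arc-minutes/pixel figures quoted so far can be sanity-checked with a one-line calculation; note that the 640-pixel width I use for Google Glass is my assumption, not a quoted spec.

def arcmin_per_pixel(fov_degrees, pixels_across):
    return fov_degrees * 60.0 / pixels_across

print(round(arcmin_per_pixel(13, 640), 2))    # Google Glass (640 pixels wide assumed): ~1.2
print(round(arcmin_per_pixel(23, 1280), 2))   # Epson BT-300: ~1.1
print(round(arcmin_per_pixel(90, 1200), 2))   # Meta 2, ~1200 pixels per eye: 4.5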

The effect of the dual spherical combiners is to act as a magnifying mirror that also moves the focus point out in space so the user can focus. The amount of magnification and the apparent focus point is a function of A) the distance from the display to the combiner, B) the distance from the eye to the combiner, and C) the curvature. I’m pretty familiar with this optical arrangement since the optical design I did at Navdy had a similarly curved combiner, but because the distances from the display to the combiner and from the eye to the combiner were so much greater, the curvature was less (larger radius).

I wonder if their very low angular resolution was a result of their design choice of the large spherical combiner and the OLED displays available for them to use. To get the “focus” correct they would need a smaller (more curved) radius for the combiner, which also increases the magnification and thus the big chunky pixels. In theory they could swap out the display for something with higher resolution, but it would take over doubling the horizontal resolution to have a decent angular resolution.

I would also be curious how well this large a plastic combiner will keep its shape over time. It is a coated mirror, and thus any minor perturbations are doubled. Additionally, any strain in the plastic (and there is always stress/strain in plastic) will cause polarization effect issues, say when viewing an LCD monitor through it. It is interesting because it is so different, although the basic idea has been around for a number of years, such as from a company called Link (see picture on the right).

Overall, Meta is bucking the trend toward smaller and lighter, and I find their angular resolution disappointing. The image quality based on some on-line see-through videos (see for example this video) is reasonably good, but you really can’t tell angular resolution from the video clips I have seen. I do give them big props for showing REAL/TRUE videos through their optics.

It should be noted that their system, at $949 for a development kit, is about 1/3 the price of Hololens and the ODG R-7 (which have only 720p per eye), though higher than the BT-300 at $750. So at least on a relative basis, they look to be much more cost effective, if quite a bit larger.

Tilted Thin Flat or Slightly Curved (ODG)

With a wide FOV tilted combiner, the microdisplay and optics are located above in a “brow” with the plate tilted (about 45 degrees), as shown at left on an Osterhout Design Group (ODG) model R-7 with 1280 by 720 pixel microdisplays per eye. The R-7 has about a 37 degree FOV and a comparatively OK 1.7 arc-minutes/pixel angular resolution.

Tilted plate combiners have the advantage of being the simplest and least expensive way to provide a large field of view while being relatively lightweight.

The biggest drawback of the plate combiner is that it takes up a lot of volume/distance in front of the eye since the plate is tilted at about 45 degrees from front to back. As the FOV gets bigger, the volume/distance required also increases.
ODG is now talking about a next model called “Horizon” (early picture at left). Note in the picture how the combiner (see red dots) has become much larger. They claim to have >50 degree FOV, and with a 1920 x 1080 display per eye this works out to an angular resolution of about 1.6 arc-minutes/pixel, which is comparatively good.

Their combiner is bigger than absolutely necessary for the ~50 degree FOV.  Likely this is to get the edges of the combiner farther into a person’s peripheral vision to make them less noticeable.

The combiner is still tilted, but it looks like it may have some curvature to it, which will tend to act as a last stage of magnification and move the focus point out a bit. The combiner in this picture is also darker than the older R-7 combiner and may have additional coatings on it.

ODG has many years of experience and has done many different designs (for example, see this presentation on Linked-In). They certainly know about the various forms of flat optical waveguides, such as Microsoft’s Hololens is using, that I am going to be talking about next time. In fact, Microsoft licensed patents from ODG for about $150M US.

Today, flat or slightly curved thin combiners like ODG is using are probably the best all-around technology in terms of size, weight, cost, and perhaps most importantly image quality. Plate combiners don’t require the optical “gymnastics” and the level of technology and precision that the flat waveguides require.

Next time — High Tech Flat Waveguides

Flat waveguides using diffraction (DOE) and/or holographic optical elements (HOE) are what many think will be the future of combiners.  They certainly are the most technically sophisticated. They promise to make the optics thinner and lighter but the question is whether they have the optical quality and yield/cost to compete yet with simpler methods like what ODG is using on the R-7 and Horizon.

Microsoft and Magic Leap are each spending literally over $1B US, and both are going with some form of flat, thin waveguides. This is a subject unto itself that I plan to cover next time.

 

HMD – A huge number of options

There have been a number of comments on this blog that I am very negative about Head Mounted Displays (HMDs), but I believe I am being realistic about the issues with HMDs. I’m an engineer by training and profession and highly analytical. Part of building successful products is to understand the issues that must be solved for the product to work. I have seen a lot of “demo ware” through the years that demos great but fails to catch on in the market.

It’s rather funny, the vitriolic response from some of the people posting in the comments (some of which are so foul they have been blocked). Why do they feel so threatened by bringing up issues with HMDs? There must have been over 100 HMDs go to market over the last 40 years, some of them with big name companies behind them, and none of them have succeeded in making the breakthrough to a consumer product. Clearly the problem is harder than people think, and it is not from a lack of trying by smart people. Why is the “101st” attempt going to succeed?

I know there is a lot of marketing and hype going on today with HMDs; I’m trying to cut through that hype to see what is really going on and what the results will be. I also have seen a lot of group think/chasing each other, where a number of companies research the same general technology and then panic to get to market after another company has made a big announcement. Many feel this is going on now with HMDs in response to Google Glass.

Designers of HMDs have a huge number of design decisions to make, and invariably each of these choices results in pros and cons. They have to make trade-offs that tend to make the HMD good for some applications and worse for others. For example, making a display see-through may be a requirement for augmented reality, but it makes the display more expensive and worse for watching movies or pictures. For immersive Virtual Reality, the design may want wide field of view (FOV) optics, which for a given display resolution and cost means that you will have low angular resolution, making it bad for information displays.

To begin with I would like to outline just some of the basic display modality options:

  1. See-through – Usually for augmented reality.  It has the drawback that it is poor for watching movies, viewing pictures, and seeing detailed content because whatever is visible in the real world becomes the “black.”   The optics tend to cost more and end up trading image quality for the ability to see through.   Also, while they may be see-through, they invariably affect the view of the real world.
  2. Monocular (one-eye) – A bit harder for people to get used to but generally less expensive and easier to adjust.   People usually have one “dominant eye” and/or good eye where the one display should be located.   A non-see-through monocular can provide a bit of a see-through effect, but generally the display dominates.   Monocular HMDs support much more flexible mounting/support options as they don’t have to be precisely located in front center of the eyes.
  3. Binocular (both eyes) – Generally supports better image quality than monocular. Can more than double the cost and power consumption of the display system (two of most everything, plus getting them to work together).  Can support 3-D stereoscopic vision.   The two displays have to be centered properly for both eyes or they will cause problems seeing the image and/or eye strain.  More likely to cause disorientation and other ill effects.
  4. Centered Vertically – While perhaps the obvious location, it means that the display will tend to dominate (or, in the case of non-see-through, totally block) the user’s vision of the real world.   Every near eye display technology, to at least some extent, negatively affects the view of the real world; even see-through displays will tend to darken and/or color shift and/or distort the view.
  5. Above and Below – Usually monocular displays are located above or below the eye so that they don’t impair forward vision when the user looks straight ahead.  This is not optimal for extensive use and can cause eye strain.   Generally the above and below positions are better for “data snacking” rather than long term use.

Within the above there are many variations and options.   For example, with a see through display you could add sunglasses to darken or totally block the outside light either mechanically or with electronic shutters (which have their own issues), but they will still not be as optimal as a purpose built non-see through display.

Then we have a huge number of issues and choices beyond the display modality that all tend to interact with each other:

  1. Cost – Always an issue and trade-off
  2. Size and Weight – A big issue for HMDs as they are worn on the head.  There are also issues with how the weight is distributed from front to back and side to side.
    1. Weight on the person’s nose – I call this out because it is a particular problem; any significant weight on the nose will build up and feel worse over time (anyone that has had glasses with glass rather than plastic lenses can tell you).    Therefore there is generally a lot of effort to minimize the weight on the person’s nose by distributing the force elsewhere, but this generally makes the device more bulky and can mess up the user’s hair.   The nose bridge, when used, generally centers and stabilizes the HMD.    Complicating this even more is the wide variety of shapes of the human head and specifically the nose.   And don’t kid yourself thinking that light guides will solve everything; they tend to be heavy as well.
  3. Resolution – Obviously more is better, but it comes at a cost both for the display and optics.  Higher resolution also tends to make everything bigger and take more power.
  4. Field of View (FOV)  – A wider FOV is more immersive and supports more information, but to support a wide FOV with good angular resolution throughout and support high acuity would require an extremely high resolution display with extremely good optics, which would be extremely expensive even if possible.   So generally a display either has a wide FOV with low angular resolution or a narrower FOV with higher angular resolution.  Immersive game-like applications generally choose a wider FOV while more information-based displays go with a narrower FOV.
  5. Exit Pupil Size – Basically this means how big the sweet spot is for viewing the image in the optics.  If you have ever used an HMD or binoculars you will notice how you have to get them centered right or you will only see part of the image with a dark ring around the outside.  As the FOV and eye relief increase, it becomes more and more difficult and expensive to support a reasonable exit pupil.
  6. Vision Blocking (particularly peripheral vision) ­– This can be a serious safety consideration for something you would think of wearing while walking and/or driving.  All these devices, to a greater or lesser extent, block vision even if the display itself is off.    Light guide type displays are not a panacea in this respect either.  While light guide displays block less in front of the user, they have the image coming in from the sides and end up blocking a significant amount of a person’s side peripheral vision, which is used to visually sense things coming toward the person.
  7. Distortion and Light Blocking – Any see-through device by necessity will affect the light coming from the real world.   There has to be an optical surface to “kick” the light toward the eye, and then light from the real world has to go through that same surface and is affected.
  8. Eye relief and use with Glasses ­ – This is an issue of how far the last optical element is away from the eye.   It is made very complicated by the fact that some people wear glasses and that faces have very different shapes.   This is mostly an issue for monocular displays, where they often use a “boom” to hold the display optics.    As you want more eye relief, the optics have to get bigger for the same FOV, which means more weight, which in turn makes support more of a problem.   This was an issue with Google Glass, as they “cheated” by having very little eye relief (to the point that they said they were not meant to be used with glasses).
  9. Vision correction – Always an issue as many people don’t have perfect vision, and generally the HMD optics want to be in the same place as a person’s glasses.  Moving the HMD optics further away to support glasses makes them bigger and more expensive.   Building corrective lenses into the HMD itself will have a huge impact on cost (you have to have another set of prescription lenses that are specially fit into the optics).   Some designs have included a diopter/focus adjustment, but many people also have astigmatism.
  10. Adjustment/Fit – This can be a big can of worms, as the more adjustable the device is, the better it can be made to fit, but also the more complex it gets to fit it properly.  With binocular displays you then have to adjust/fit both eyes, which may need moving optics.
  11. Battery life (and weight) – An obvious issue, made worse by dual displays.  At some point the battery has to be moved either to the back of the head (hope you don’t have a lot of hair back there) or via a cable to someplace other than the head.
  12. Connection/cabling – Everyone wants wireless, but then this means severe compromises in terms of power, weight, support on the head, processing power (heat, battery power, and size).
  13. How it is mounted (head bands, over the head straps, face goggles) – As soon as you start putting much stuff on the head, a simple over-the-ears mount with a nose bridge is not going to feel comfortable and you have to look to other ways to support the weight and hold it steady.   You end up with a lot of bad alternatives that will at a minimum mess with people’s hair.
  14. Appearance – The more you try to do on the head, the bigger, bulkier, and uglier it is going to get.
    1. Look of the eyes – I break this out separately because humans are particularly sensitive to how people’s eyes look.  Many of the HMD displays make the eyes look particularly strange with optical elements right in front of the eyes (see the Epson picture below).
  15. Storage/fragility – A big issue if this is going to be a product you wear when you go out.   Unlike your cell phone, which you can slip in your pocket, HMDs don’t generally fold up into a very small form factor/footprint, and they are generally too fragile to put in your pocket even if they (with all the straps and cables they may have) would fit.
  16. Input – A very big topic I will save for another day.

If you take the basic display types with all the permutations and combinations of all the other issues above (and the list is certainly not exhaustive), you get a mind-boggling number of different configurations, and over the last 40 years almost every one of these has been tried to a greater or lesser extent.  And guess what, none of them have succeeded in the consumer market.   Almost every time you try to improve one of the characteristics above, you hurt another.    Some devices have found industrial and military use (it helps if there is not a lot of hair on the user’s head, they are strong enough to carry some weight on their head, they don’t care what they look like, and they are ordered to do the training).

In future posts, I plan on going into more detail on some of the options above.

One last thing on the comment section: I’m happy to let through comments that disagree with me, as it helps both me and others understand the subject.   But I am not going to put up with foul language and personal attacks and will quickly hit the “trash” button, so don’t waste your time.   I put this under the “if you can’t argue the facts then you trash the person” category and will file it accordingly.