Archive for OLED

Collimation, Etendue, Nits (Background for Understanding Brightness)

Introduction

I’m getting ready to write a much-requested set of articles on the pros and cons of various types of microdisplays (LCOS, DLP, and OLED in particular, with some discussion of other display types). As a prerequisite, I should give some key information on the character of light as it pertains to what people generally refer to as “brightness.” For some of my readers, this discussion will be very elementary/crude/imprecise, but it is important to have at least a rudimentary understanding of nits, collimation, and etendue to understand some of the key characteristics of the various types of displays.

Light Measures – Lumens versus Nits

The figure on the left from an Autodesk Workshop page illustrates some key light measurements. Lumens are a measure of the total light emitted. Candelas (Cd) are a measure of the light emitted over a solid angle. Lux measures the light per square meter that hits a surface. Nits (Cd/m²) measure the light emitted per unit area per unit solid angle. Key for a near-eye display: we only care about the light in the direction that makes it to the eye’s pupil.

We could get more nits by cranking up the light source’s brightness, but that would mean wasting a lot of light. More efficiently, we could use optics to steer a higher percentage of the total light (lumens) toward the eye. In this example, we could add lenses and reflectors to aim the light at the surface, and we could make the surface more reflective and more directional (known as the “gain” of a screen). Very simply put, lumens measure the total light output from a light source, while nits measure the light in a specific direction.
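As a rough sanity check on the lumens-versus-nits distinction, the small sketch below (with hypothetical panel numbers, not measurements of any real device) converts the luminance of a flat Lambertian emitter into total luminous flux:

```python
import math

def lambertian_flux_lumens(luminance_nits, emitting_area_m2):
    """Total luminous flux (lumens) from a flat Lambertian emitter.

    Integrating a constant luminance over the hemisphere gives
    flux = pi * L * A.
    """
    return math.pi * luminance_nits * emitting_area_m2

# Hypothetical OLED microdisplay: a 10 mm x 6 mm panel at 300 nits
area = 0.010 * 0.006            # m^2
flux = lambertian_flux_lumens(300, area)
print(f"{flux:.4f} lumens")     # a small fraction of a lumen
```

The point of the example is how small the total flux is: a few hundred nits over a tiny panel is well under a tenth of a lumen, which is why the *direction* of the light matters so much for a near-eye display.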

Etendue

The casual observer might think: just put a lens in front of, or a mirror behind and around, the light source (like a car’s headlight) and concentrate the light. And yes, this will help, but only within limits. The absolute limit is set by a law of physics that can’t be violated, known as “etendue.”

There are more detailed definitions, but one of the simplest (and for our purposes practical) principles is given in a presentation by Gaggione on collimating LED light, which states that “the beam diameter multiplied by the beam angle is a constant value” [for an ideal element]. In simpler terms, if we put in an optical element that concentrates/focuses the light, the angles of the light rays will increase. This has profound implications for collimating light. Another good, but somewhat more technical, presentation on etendue and collimation is given by LPI.

Another law of physics is that etendue can only increase (or, at best, be preserved). This means that once the light is generated, the light rays can only become more random. Every real optical element will hurt/increase etendue. Etendue is analogous to entropy in the second law of thermodynamics, which states that entropy can only increase.
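A minimal numeric illustration of the diameter-times-angle rule (idealized, lossless optics; the example dimensions are made up):

```python
import math

def collimated_half_angle(src_diam_mm, src_half_angle_deg, out_diam_mm):
    """Etendue conservation for an ideal optic (1-D form):
    D_in * sin(theta_in) = D_out * sin(theta_out).
    Returns the best-case output half angle in degrees."""
    s = (src_diam_mm / out_diam_mm) * math.sin(math.radians(src_half_angle_deg))
    return math.degrees(math.asin(s))

# A 1 mm wide emitter filling +/-90 degrees, collimated by a 10 mm aperture:
print(round(collimated_half_angle(1.0, 90.0, 10.0), 2))  # ~5.74 degrees
```

Note that the output beam can never be perfectly parallel: making the beam 10X wider only shrinks the angles by roughly 10X, it never drives them to zero.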

Lambertian Emitters (Typical of LEDs/OLEDs)

LEDs and OLEDs used in displays tend to be “Lambertian emitters,” where the luminous intensity falls off with the cosine of the angle from the surface normal. The figure on the right shows this for a single emitting point on the surface. A real LED/OLED will not be a single point but an area, so one can imagine a large set of these emitting points spread two-dimensionally.
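To make the cosine fall-off concrete, the sketch below numerically integrates a Lambertian intensity pattern over the hemisphere; the closed-form answer is π times the on-axis intensity:

```python
import math

# Numerically integrate I(theta) = I0 * cos(theta) over the hemisphere.
# The exact result for a Lambertian point is pi * I0.
I0 = 1.0
n = 100000
total = 0.0
for i in range(n):
    theta = (i + 0.5) * (math.pi / 2) / n      # midpoint rule
    d_theta = (math.pi / 2) / n
    # solid-angle ring at angle theta: dOmega = 2*pi*sin(theta)*d_theta
    total += I0 * math.cos(theta) * 2 * math.pi * math.sin(theta) * d_theta

print(round(total, 4), round(math.pi, 4))  # both ~3.1416
```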

Square Law and Concentrating Light

It is very important to note that the diagram above shows only a side view. The light rays spread out as a sphere, and nits are a measure of light per unit area on the surface of that sphere. If the linear spread is reduced by a factor of X, the nits will increase by X-squared.
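A quick check of the square law using cone solid angles (the ±60° and ±15° values here are arbitrary illustrations): reducing the angular spread by 4X increases the nits by close to 4² = 16X, exactly 16X in the small-angle limit:

```python
import math

def solid_angle_sr(half_angle_deg):
    """Solid angle of a cone: Omega = 2*pi*(1 - cos(half_angle))."""
    return 2 * math.pi * (1 - math.cos(math.radians(half_angle_deg)))

# Squeezing the same flux from a +/-60 degree cone into +/-15 degrees:
gain = solid_angle_sr(60) / solid_angle_sr(15)
print(round(gain, 1))  # ~14.7x the nits (16x in the small-angle limit)
```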

Since for a near-eye display the only light that “counts” is that which makes it into a person’s eye, there is a big potential gain in brightness that comes not from making the light source brighter but from reducing the angles of the light rays in the form of collimation.

Collimating Light

Collimation is the process of getting light rays as parallel to each other as possible (within the laws of etendue). Collimated light is required for projecting an image (as with a projector), for making very high luminance (nits) near-eye displays, and for getting light to work properly with a waveguide (waveguides require highly collimated light to work at all).

Shown below is the classic issue with collimating light. A light source with the center point “2” and the two extreme points “1” and “3” at the left and right edges of a Lambertian emitter are shown. There is a lens (in blue) trying to collimate the light, located at a distance equal to the focal length of the lens. Also shown is a reflector in dashed blue that is often used to capture and redirect the outermost rays that would otherwise bypass the lens.

The “B” figure shows what happens when 3 light rays (1a, 2a, and 3a) from the 3 points enter the lens at roughly the same place (indicated by the green circle). The lens can only perfectly collimate the center ray 2a to become 2a’ (dashed line), which exits along with all the other rays from point 2 perfectly parallel/collimated. Rays 1a and 3a have their angles reduced (consistent with the laws of etendue, since the output area is larger than the source area) to become 1a’ and 3a’, but they are not perfectly parallel to ray 2a’ or to each other.

If the light source were larger, such that points 1 and 3 were farther apart, the angles of rays 3a’ and 1a’ would be more severe and the light less collimated. If the light source were smaller, the light would be more highly collimated. This illustrates how emitting area can be traded against angular diversity under the laws of etendue.

Illuminating a Microdisplay (DLP or LCOS) Versus Self Emitting Display (OLED)

Very simply put, what we get conceptually by collimating a small light source (such as a set of small RGB LEDs) is a bundle of individual highly collimated light sources to illuminate each pixel of a reflective microdisplay like DLP or LCOS. The DLP or LCOS pixel mirrors then simply reflect light with the same characteristics, with some losses and scattering due to imperfections in the mirrors.

The big advantage in terms of intensity/nits for reflective microdisplays is that they separate the illumination process from the light modulation. They can take very bright, small LEDs and then highly collimate the light to further increase the nits. It is possible to get many tens of thousands of nits illuminating a reflective microdisplay.

An OLED microdisplay is self-emitting, and the light is Lambertian, which as shown above is somewhat diffuse. Typically, an OLED microdisplay can emit only about 200 to at most 400 nits for long periods of time (some lab prototypes have claimed up to 5,000 nits, but this is unlikely to be sustainable). Going brighter for long periods will cause the OLED materials to degrade/burn up.

With an OLED you are somewhat stuck with the type of light, Lambertian, as well as the amount of light. The optics have to preserve the image quality of the individual pixels. If you wanted to, say, collimate the Lambertian light, it would have to be done on the individual pixels with miniature optics directly on top of each pixel (say, a microlens-like array) so there is a small spot size (pixel) to collimate. I have heard several people theorize this might be possible, but I have not seen it done.

Next Time Optical Flow

Next time I plan to build on these concepts to lay out the “optical flow” for a see-through (AR) microdisplay headset. I will also discuss some of the issues/requirements.


Disney-Lenovo AR Headset – (Part 1 Optics)

Disney Announced Joint AR Development At D23

Disney at their D23 Fan Convention in Anaheim on July 15th, 2017 announced an Augmented Reality (AR) Headset jointly developed with Lenovo. Below is a crop and brightness enhanced still frame capture from Disney’s “Teaser” video.

Disney/Lenovo also released a video of an interview at the D23 convention which gave further details. As the interview showed (see right), the device is based on using a person’s cell phone as the display (similar to Google Cardboard and Samsung’s Gear VR).

Birdbath Optics

Based on analyzing the two videos plus some knowledge of optical systems, it is possible to figure out what they are doing in terms of the optical system. Below is a diagram of what I see them doing in terms of optics (you may want to open this in a separate window to view the figure during the discussion below).

All the visual evidence indicates that Disney/Lenovo is using a classical “birdbath” optical design (discussed in an article on March 3, 2017). The name “birdbath” comes from the use of a spherical semi-mirror with a beam splitter directing light into the mirror. Birdbath optics are used because they are relatively inexpensive, lightweight, support a wide field of view (FOV), and are “on axis” for minimal distortion and focusing issues.

The key element of the birdbath is the curved mirror, which is (usually) the only “power” (focus-changing) element. The beauty of mirror optics is that they have essentially zero chromatic aberration, whereas it is difficult/expensive to reduce chromatic aberrations with lens optics.

The big drawbacks of birdbath optics are that they block a lot of light, both from the display device and from the real world, and that they cause double images from unwanted reflections of “waste” light. Both of these negative effects can be seen in the videos.

There would be no practical way (that I know of) to support a see-through display with a cell-phone-sized display using refractive (lens) optics such as those used with Google Cardboard or the Oculus Rift. The only practical ways I know of for supporting an AR/see-through display using a cell-phone-sized display all use curved combiners/mirrors.

Major Components

Beam Splitter – The design uses a roughly 50/50 semi-mirror beam splitter, which has a coating (typically an aluminum alloy, although it is often called “silver”) that lets about 50 percent of the light through while acting like a mirror for the other 50%. A polarizing beam splitter would be problematic with most phones and is much more expensive. Note that the beam splitter is arranged to kick the image from the phone toward the curved combiner and away from the person’s eyes; thus light from the display is first reflected and then makes a transmissive pass.

Combiner – The combiner, a spherical semi-mirror, is the key to the optics and does multiple things. It appears to also be about 50/50 transmissive/mirrored. The curved mirror’s first job is to allow the user to focus on the phone’s display, which otherwise would be too close to a person’s eyes for comfortable focusing. The other job of the combiner is to combine the light/image from the “real world” with the display light; it does this with the semi-mirror letting light from the display image reflect while light from the real world passes through toward the eye. The curved mirror only has a significant optical power (focus) effect on the reflected display light and causes very little distortion of the real world.

Clear Protective Shield

As best I can tell from the two videos, the shield is pretty much clear and serves no function other than to protect the rest of the optics.

Light Baffles Between Display Images

One thing seen in the picture at top is a set of back-stepped light baffles to keep down light cross-talk between the eyes.

Light Loss (Follow the Red Path)

A huge downside of the birdbath design is the light loss, as illustrated in the diagram by the red arrow path, where the thickness of the arrows is roughly to scale with the relative amount of light. To keep things simple, I have assumed no other losses (there are typically 2% to 4% per surface).

Starting with 100% of the light leaving the phone display, about 50% goes through the beam splitter and is lost, while the other 50% is reflected toward the combiner. The combiner is also about 50% mirrored (a rough assumption), and thus 25% (0.5 × 0.5) of the display’s light has its focus changed and is reflected back toward the beam splitter. About 25% of the light also goes through the combiner and causes the image you can see in the picture on the left. The beam splitter in turn lets 50% of that 25%, or only about 12.5%, of the light pass toward the eye. Allowing for some practical losses, less than 10% of the light from the phone makes it to the eye.
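The loss budget above is easy to tabulate; this sketch just multiplies out the idealized 50% splits described in the text (surface losses ignored):

```python
# Idealized birdbath loss budget: 50/50 beam splitter, ~50/50 combiner,
# no surface losses (real systems lose another 2-4% per surface).
display = 1.00
after_splitter = display * 0.5         # half reflected toward the combiner
after_combiner = after_splitter * 0.5  # half reflected back, focus changed
to_eye = after_combiner * 0.5          # half transmitted through the splitter
print(f"{to_eye:.1%}")                 # 12.5% of the display light
```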

Double Images and Contrast Loss (Follow the Green Dash Path)

Another major problem with birdbath optics is that the lost light will bounce around and cause double images and losses in contrast. If you follow the green path, like the red path, about 50% of the light will be reflected and 50% will pass through the beam splitter (not shown on the green path). Unfortunately, a small percentage of the light that is supposed to pass through will be reflected at the glass/plastic-to-air interface as it tries to exit the beam splitter, as indicated by the green and red dashed lines (part of the red dashed line is obscured). This dashed path ends up causing a faint ghost image that is offset by the thickness of the beam splitter tilted at 45 degrees. Depending on the coatings, this ghost image could be from 1% to 5% of the brightness of the original image.

The image on the left is a crop from a still frame of the video Disney showed at the D23 conference, with red arrows I added pointing to double/ghost images (click here for the uncropped image). The demo Disney gave was on a light background, and these double images would be even more noticeable on a dark background. The same type of vertically offset double image could be seen in the Osterhout Design Group (ODG) R8 and R9 headsets, which also use a birdbath optical path (see figure on the right).

A general problem with the birdbath design is that there is so much light “rattling around” in the optical wedge formed by the display surface (in this case the phone), the beam splitter, and the combiner mirror. Note in the diagram that about 12.5% of the light returning from the combiner mirror is reflected off the beam splitter heading back toward the phone. This light is eventually going to hit the front glass of the phone, and while much of it will be absorbed by the phone, some of it is going to reflect back, hit the beam splitter, and eventually make it to the eye.

About 80% of the Real World Light Is Blocked

In several frames of the D23 interview video it was possible to see through the optics and make measurements of the relative brightness looking through and around the optics. This measurement is only rough, and it helped to take it from several different images. The result was about a 4.5X to 5X difference in brightness looking through the optics.

Looking back at the blue/center line in the optical diagram, about 50% of the light is blocked by the partially mirrored combiner, and then 50% of the remainder is blocked by the beam splitter, for a net of 25%. With other practical losses, including the shield, this comes close to roughly 80% (4/5ths) of the light being blocked.
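The see-through path works the same way; a two-line sketch of the idealized transmission:

```python
# Real-world (see-through) path: ~50% through the partially mirrored
# combiner, then ~50% through the beam splitter (shield and surface
# losses ignored).
see_through = 0.5 * 0.5
print(f"{see_through:.0%} transmitted, {1 - see_through:.0%} blocked")
```

With practical losses added, this idealized 75% blocked approaches the roughly 80% measured from the videos.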

Is A Cell Phone Bright Enough?

For movies in a dark room, the ANSI/SMPTE 196M spec recommends about 55 nits. A cell phone typically has from 500 to 800 peak nits (see Displaymate’s shootouts for objective measurements), but after about a 90% optical loss the image would be down to between about 50 and 80 nits, which could be acceptable in a moderately dark room. But if the room lights are on, this will be at best marginal, even after allowing for the headset blocking about 75 to 80% of the room light between the combiner and the beam splitter.
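Putting the numbers together (the ~10% net efficiency is my estimate from the loss budget above, and the panel nits are typical published peaks, not measurements of this headset):

```python
def nits_at_eye(panel_nits, optical_efficiency=0.10):
    """Display nits reaching the eye after the birdbath losses."""
    return panel_nits * optical_efficiency

SMPTE_DARK_ROOM_NITS = 55  # approximate ANSI/SMPTE 196M recommendation
for panel in (500, 800):
    eye = nits_at_eye(panel)
    verdict = "OK" if eye >= SMPTE_DARK_ROOM_NITS else "marginal"
    print(f"{panel} nit panel -> {eye:.0f} nits at the eye ({verdict} for a dark room)")
```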

With AR you are not just looking at a blank wall. To make something look “solid”/non-transparent, the display image needs to “dominate” by being at least 2X brighter than anything behind it. It becomes even more questionable that there is enough brightness unless there is not a lot of ambient light (or everything in the background is dark colored, or the room lights are very dim).

Note that LCOS- or DLP-based see-through AR systems can start with about 10 to 30 times or more the brightness (nits) of a cell phone. They do this so they can work in a variety of lighting conditions after all the other light losses in the system.

Alternative Optical Solution – Meta-2 “Type”

Using a large display like a cell phone rather than a microdisplay severely limits the optical choices for a see-through display. Refractive (lens) optics, for example, would be huge and expensive, and Fresnel optics come with their own optical issues.

Meta-2 “Bug-Eye” Combiners

The most obvious alternative to the birdbath would be to go with dual large combiners such as the Meta-2 approach (see left). When I first saw the Disney-Lenovo design, I even thought it might be using the Meta-2 approach (disproven on closer inspection). With the Meta-2, the beam splitter is eliminated and two much larger semi-spherical combiners (giving a “bug-eye” look) have a direct path to the display. Still, the bug-eyed combiner is not that much larger than the shield on the Disney-Lenovo system. Immediately, you should notice how the user’s eyes are visible, which shows how much more light is getting through.

Because there is no beam splitter, the Meta-2 design is much more optically efficient. Rough measurements from pictures suggest the Meta-2’s combiners pass about 60% of the light and thus reflect about 40%. This means that with the same display, it would make the display appear 3 to 4 times brighter while letting through about 2.5X the real-world light compared with the Disney-Lenovo birdbath design.
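The relative-brightness claims can be checked against the rough reflect/transmit splits quoted above (all numbers are estimates, with surface losses ignored):

```python
# Comparing the two designs with the rough splits from the text.
birdbath_display = 0.5 * 0.5 * 0.5   # splitter, combiner, splitter
birdbath_world   = 0.5 * 0.5         # combiner, then splitter
meta2_display    = 0.40              # single ~40% reflective combiner
meta2_world      = 0.60              # ~60% transmissive

print(round(meta2_display / birdbath_display, 1))  # ~3.2x brighter display
print(round(meta2_world / birdbath_world, 1))      # ~2.4x more world light
```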

I have not tested a Meta-2 nor read any serious technical evaluation (just the usual “ooh-wow” articles), and I have some concerns with the Meta design. The Meta-2 is “off-axis” in that the display is not perpendicular to the combiner. One of the virtues of the birdbath is that it results in a straightforward on-axis design. With an off-axis design, I wonder how well the focus distance is controlled across the FOV.

Also, the Meta-2 combiners are so far from the eyes that a person’s two eyes would have optical cross-talk (there is nothing to keep one eye from seeing what the other eye sees, such as the baffles in the Disney-Lenovo design). I don’t know how this would affect things in stereo use, but I would be concerned.

In terms of simple image quality, I would think the single bug-eye-style combiner has the edge. There are no secondary reflections caused by the beam splitter, and both the display and the real world would be significantly brighter. In terms of cost, I see pros and cons for each design and overall not a huge difference, assuming both designs started with a cell phone display. In terms of weight, I don’t see much of a difference either.

Conclusions

To begin with, I would not expect even good image quality out of a phone-as-a-display AR headset. Even totally purpose-built AR displays have their problems, and making a device “see-through” generally makes everything more difficult/expensive.

The optical design has to be compromised right from the start to support both LCD and OLED phones that could have different sizes. Making matters worse is the birdbath design with its huge light losses. Add to this the inherent reflections in the birdbath design and I don’t have high hopes for the image quality.

It seems to me a very heavy “lift” even for the Disney and Star Wars brands. We don’t have any details on the image tracking and room tracking, but I would expect that, like the optics, it will be done on the cheap. I have no inside knowledge, but it almost looks to me as if the solution was designed around supporting the Jedi light saber shown in the teaser video (right). They need the see-through aspect so the user can see the light saber. But making the headset see-through is a long way to go to support the saber.

BTW, I’m a big Disney fan from way back (I have been to the Disney parks around the world multiple times, attended D23 conventions, eaten at Club 33, was a member of the “Advisory Council” in 1999-2000, own over 100 books on Disney, and own one of the largest 1960s-era Disneyland Schuco monorail collections in the world). I have an understanding and appreciation of Disney fandom, so this is not a knock on Disney in general.

Varjo Foveated Display Part 2 – Region Sizes

Introduction

As discussed in Part 1, the basic concept of a foveated display should in theory provide high angular resolution with a wide FOV. No single display technology today for near-to-eye displays does both: microdisplays (LCOS, DLP, and OLED) support high angular resolution but not wide FOV, while larger flat-panel displays (OLED and LCD) support wide FOV but with low angular resolution.

The image above left includes crops from the picture on Varjo’s website called “VR Scene Detail” (toward the end of this article is the whole annotated image). Varjo included both the foveated and un-foveated image from the center of the display. The top rectangle in red is taken from the top edge of the picture, where we can just see the transition starting from the foveated image to what Varjo calls the “context” or lower-resolution image. Blending is used to avoid an abrupt transition that the eye might notice.

The topic of foveation gathered additional interest with Apple’s acquisition of the eye-tracking technology company SMI, which provided the eye tracking for Nvidia’s foveated-rendering HMD study (see below). It is not clear at this time why Apple bought SMI; it could be for foveated rendering (f-rendering) and/or foveated display (f-display).

Static Visual Acuity

The common human visual acuity charts (right) give some feel for why foveation (f-rendering and/or f-display) works. But these graphs are for static images of high-contrast black-and-white line pairs. We commonly talk about a person seeing down to 1 arcminute per pixel (300 dpi at about 10 inches) as being good, but people can detect down to about 1/2 arcminute, and for a long single high-contrast line, down to about 1/4 arcminute. The point here is to understand that these graphs are a one-dimensional slice of a multi-dimensional issue.

For reference, Varjo’s high-resolution display has slightly less than 1 arcminute/pixel, and the context display in their prototype has about 4.7 arcminutes/pixel. More importantly, their high-resolution display covers about 20 degrees horizontally and 15 degrees vertically, and this is within the range where people could see errors if they are high in contrast, based on the visual acuity graphs.

Varjo will be blending to reduce the contrast difference and thus make the transition less noticeable. But on the negative side, with any movement of the eyes, the image on the foveated display will change, and the visual system tends to amplify any movement/change.

Foveated Rendering Studies

F-rendering varies the detail/resolution/quality/processing based on where the eyes are looking. This is seen as key not only to reducing the computing requirement but also to saving power. F-rendering has been proven to work in many human studies, including those done as part of Microsoft’s 2012 and Nvidia’s 2016 papers. F-rendering becomes ever more important as resolution increases.

F-rendering uses a single high-resolution display and changes the level of rendering detail. It then uses blending between the various detail levels to avoid abrupt changes that the eye can detect. As the Microsoft and Nvidia papers point out, the eye is particularly sensitive to changes/movement.

In the often-cited Microsoft 2012 study, they used 3 levels of detail with two “blend masks” between them, as illustrated in their paper (see right). This gave them a very gradual and wide transition, but 3 resolution levels with wide bands of transition are “luxuries” that Varjo can’t have. Varjo has only two possible levels of detail, and as will be shown, they can only afford a narrow transition/blend region. The Microsoft 2012 study used only a 1920×1080 monitor with a lower-resolution central region than Varjo (about half the resolution) and 3 blending regions so broad that they would be totally impractical for an f-display.

Nvidia’s 2016 study (which cites Microsoft 2012) simplified to two levels of detail, fovea and periphery, with sampling factors of 1 and 4 and a simpler linear blending between the two detail levels. Unfortunately, most of Nvidia’s study was done with a very low angular resolution Oculus headset display with about 4.7 arcminutes/pixel and a little over 1,000 by 1,000 pixels per eye, the same display Varjo uses for the low-resolution part of its image. Most of the graphs and discussion in the paper concern this low angular resolution headset.
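Nvidia’s two-level scheme with linear blending can be sketched as a simple weight function of eccentricity; the 7.5° radius matches the figure quoted from the paper’s desktop setup below, but the blend width here is an illustrative guess, not a value from the paper:

```python
def blend_weight(eccentricity_deg, fovea_radius_deg=7.5, blend_width_deg=2.5):
    """Weight of the high-detail image at a given angle from the gaze point:
    1.0 inside the fovea radius, falling linearly to 0.0 across the blend
    band, 0.0 in the periphery."""
    if eccentricity_deg <= fovea_radius_deg:
        return 1.0
    if eccentricity_deg >= fovea_radius_deg + blend_width_deg:
        return 0.0
    return 1.0 - (eccentricity_deg - fovea_radius_deg) / blend_width_deg

for e in (5.0, 8.75, 12.0):
    print(e, blend_weight(e))   # 1.0, 0.5, 0.0
```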

Nvidia 2016 also did some study of a 27″ (diagonal) 2560×1440 monitor with the user 81 cm away, resulting in an angular resolution of about 1 arcminute and a horizontal FOV of 40 degrees, which would be more applicable to Varjo’s case. Unfortunately, as the paper states of their user study, “We only evaluate the HMD setup, since the primary goal of our desktop study in Section 3.2 was to confirm our hypothesis for a higher density display.” The only clue they give for the higher-resolution system is that “We set the central foveal radius for this setup to 7.5°.” There was no discussion I could find of how they set the size of the blend region, so it is only a data point.

Comment/Request: I looked around for a study that would be more applicable to Varjo’s case. I was expecting to find a foveated-rendering study using, say, a 4K (3840×2160) television, which would support 1 arcminute over 64 by 36 degrees, but I did not find one. If you know of such a study, let me know.

Foveated Rending is Much Easier Than Foveated Display

Even if we had an f-rendering study of a ~1-arcminute peak-resolution system, it would still give only some insight into the f-display issues. F-rendering, while conceptually similar and likely to be required to support an f-display, is significantly simpler.

With f-rendering, everything is mathematical beyond the detection of the eye movement. The sizes of the high-resolution and lower-resolution region(s) and the blend region(s) can be arbitrary to reduce detection, and can even be dynamic based on content. The alignment between resolutions is perfectly registered. The color and contrast between resolutions are identical. The rendering of the high-resolution area does not have to be scaled/re-sampled to match the background.

Things are much tougher for an f-display, as there are two physically different displays, and the high-resolution display has to be optically aligned/moved based on the movement of the eye. The alignment of the displays is limited by the optics’ ability to move the apparent location of the high-resolution part of the image, and there is likely to be some vibration/movement even when aligned. The potential size of the high-resolution region, as well as the size of the transition region, is limited by the size/cost of the microdisplay used, and there can be only a single transition. The brightness, color, and contrast will differ between the two physically different displays (even if both are, say, OLED, the brightness and colors will not be exactly the same). Additionally, the high-resolution display’s image will have to be remapped to correct for optical distortion and match the context/peripheral image; this will both reduce the effective resolution and introduce movement into the highest-resolvable (by the eye) part of the FOV as the foveated display tracks the eye on what should otherwise be a stationary image.

When asked, Varjo said they have more capable systems in the lab than the fixed f-display prototype they are showing. But they stopped short of saying whether they have a fully working system, and they have provided no results from any human studies.

The bottom line here is that there are many more potential issues with an f-display that could prove very hard, if not practically impossible, to solve. A major problem is getting the high-resolution image to optically move and stop without the eye noticing. It is impossible to fully understand how well it will work without a full-blown working system and a study with humans using a wide variety of content and user conditions, including the user moving their head, and the reaction of the display and optics.

Varjo’s Current Demo

Varjo is currently demoing a proof-of-concept system with the foveated/high-resolution image fixed and not tracking the center of vision. The diagram below shows the 100-by-100-degree FOV of the current Varjo demonstration system. For the moment at least, let’s assume their next step will be a version where the center/foveated image moves.

Shown in the figure above is roughly the size of the foveated display region (green rectangle), which covers about 27.4 by 15.4 degrees. The dashed red rectangle shows the area covered by the pictures provided by Varjo, which does not even fully cover the foveated area (in the pictures they just show the start of the transition/blending from high to low resolution).

Also shown is a dashed blue circle with the 7.5-degree “central foveal radius” (15-degree diameter) of the Nvidia 2016 high angular resolution system. It is interesting that this is pretty close to the angle covered vertically by the Varjo display.

Will It Be Better Than A Non-Foveated Display (Assuming Very Good Eye Tracking)?

Varjo’s foveated display should appear to the human eye as having much higher resolution than a non-foveated display with the same resolution as Varjo’s context/periphery display. It is certainly going to work well when totally stationary (as in Varjo’s demo system).

My major concern comes (and this can’t be tested without a full-blown system) when everything moves. The evidence above suggests that there may be visible moving noise at the boundaries of the foveated and context images.

Some of the factors that could affect the results:

  1. Size of the foveated/central image. Making this bigger would move the transition further out. This could be done optically or with a bigger device. Doing it optically could be expensive/difficult and using a larger device could be very expensive.
  2. The size of the transition/blur between the high and low resolution regions. It might be worth losing some of the higher resolution to get a smoother transition. From what I can tell, Varjo has a small transition/blend region compared to the f-rendering systems.
  3. The accuracy of the tracking and placement of the foveated image, in particular how accurately they can optically move the image. I wonder how well this will work in practice and whether it will have problems with head movement causing vibration.
  4. How fast they can move the foveated image and have it be totally still while displaying.

A Few Comments About Re-sampling of the Foveated Image

One should also note that the moving foveated image will by necessity have to be mapped onto the stationary low-resolution image. Assuming the rendering pipeline first generates a rectangular-coordinate image and then re-samples it to adjust for the placement and optical distortion of the foveated image, the net effective resolution will be about half that of the “native” display due to the re-sampling.
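A tiny 1-D sketch shows why re-sampling costs resolution: bilinearly re-sampling a one-pixel-period pattern at a half-pixel offset (the worst case for a moving foveated image) wipes out its contrast entirely:

```python
# A one-pixel-period black/white pattern: the finest detail a display can show.
src = [1.0 if i % 2 == 0 else 0.0 for i in range(10)]

def sample(img, x):
    """1-D bilinear (linear) sample at fractional position x."""
    i = int(x)
    f = x - i
    return img[i] * (1 - f) + img[min(i + 1, len(img) - 1)] * f

# Re-sample the pattern shifted by half a pixel:
resampled = [sample(src, i + 0.5) for i in range(9)]
print(resampled)  # every value is 0.5 -> the pattern's contrast is gone
```

Real 2-D remapping with sub-pixel distortion correction behaves the same way on its finest detail, which is where the roughly 2X effective-resolution loss comes from.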

In theory, this re-sampling loss could be avoided/reduced by computing the high-resolution image with the foveated mapping already applied, but with “conventional” pipelines this would add a lot of complexity. This type of display would likely, in the long run, be used in combination with foveated rendering, where this may not add much more to the pipeline (just something to deal with the distortion).

Annotated Varjo Image

First, I want to compliment Varjo for putting actual through-the-optics, high-resolution images on their website (note: click on their “Full size JPG version“). By Varjo’s own admission, these pictures were taken crudely with a consumer camera, so the image quality is worse than you would see looking into the optics directly. In particular, there are chroma aberrations clearly visible in the full-size image that are likely caused by the camera and how it was used, and not necessarily a problem with Varjo’s optics. If you click on the image below, it will bring up the full-size image (over 4,000 by 4,000 pixels and about 4.5 megabytes) in a new tab.

If you look at the green rectangle, it corresponds to the size of the foveated image in the green rectangle of the prior diagram showing the whole 100 by 100 degree FOV.

You should be able to clearly see the transition/blending starting at the top and bottom of the foveated image (see also right). The end of the blending is cut off in the picture.

The angles given in the figure were calculated based on the known pixel size of the Oculus CV1 display (its pixels are clearly visible in the non-foveated picture). For the “foveated display” (green rectangle) I used Varjo’s statement that it was at least 70 pixels/degree (but I suspect not much more than that either).
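As a rough cross-check of these numbers (my own arithmetic, assuming the CV1's commonly cited 1200-pixel vertical panel resolution per eye and the ~100 degree vertical FOV from the earlier diagram):

```python
# Cross-check of the angular resolutions (my estimates, not Varjo's):
# the context display is the Oculus CV1 panel, assumed 1200 pixels tall
# per eye, spread over roughly the 100-degree vertical FOV shown in the
# earlier diagram.
cv1_pixels_vertical = 1200          # per-eye panel height (assumption)
fov_vertical_deg = 100              # from the FOV diagram

context_ppd = cv1_pixels_vertical / fov_vertical_deg
print(f"context display: ~{context_ppd:.0f} pixels/degree")

# With ~6.2x smaller pixels in roughly the same optical path, the
# foveated display should land near Varjo's stated ~70 pixels/degree:
foveated_ppd = context_ppd * 6.2
print(f"foveated display: ~{foveated_ppd:.0f} pixels/degree")
```

The result (~12 pixels/degree for the context display and ~74 for the foveated one) is consistent with Varjo's "at least 70 pixels/degree" claim.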

Next Time On Foveated Displays (Part 3)

Next time on this topic, I plan on discussing how foveated displays may or may not compete in the future with higher resolution single displays.

Varjo Foveated Display (Part 1)

Introduction

The startup Varjo recently announced its Foveated Display (FD) technology and did a large number of interviews with the technical press. I’m going to break this article into multiple parts. As currently planned, the first part will discuss the concept and the need for it, and part 2 will discuss how well I think it will work.

How It Is Supposed to Work

Varjo’s basic concept is relatively simple (see figure at left – click on it to pop it out). Varjo optically combines an OLED microdisplay with small pixels, giving high angular resolution over a small area (what they call the “foveated display“), with a larger OLED display, giving low angular resolution over a large area (what they call the “context display“). By eye tracking (not done in the current prototype), the foveated display is optically moved to the center of the person’s vision by tilting the beam splitter. Varjo says they have thought of, and are patenting, other ways of optically combining and moving the foveated image besides a beam splitter.

The beam splitter is likely just a partially silvered mirror. It could be 50/50 or some other ratio to match the brightness of the large OLED and the microdisplay. This type of combining is very old and well understood. They likely will blend/fade-in the image in the rectangular border where the two display images meet.

The figure above is based on a sketch by Urho Konttori, CEO of Varjo, in a video interview with Robert Scoble, combined with pictures of the prototype in Ubergismo (see below), plus answers to some questions I posed to Varjo. It is roughly drawn to scale based on the available information. The only thing I am not sure about is the “microdisplay lens,” which was shown but not described in the Scoble interview. This lens (or lenses) may or may not be necessary based on the distance of the microdisplay from the beam combiner, and could be used to help make the microdisplay pixels appear smaller or larger. If the optical path through the beam combiner to the large OLED (in the prototype, from an Oculus headset) were equal to the path to the microdisplay via reflection off the combiner, then the microdisplay lens would not be necessary. Based on my scale drawing and looking at the prototype photographs, it would be close to not needing the lens.

Varjo is likely using either an eMagin OLED microdisplay with a 9.3 micron pixel pitch or a Sony OLED microdisplay with an 8.7 micron pixel pitch. The Oculus headset OLED has a ~55.7 micron pixel pitch. It does not look from the configuration like the microdisplay image will be magnified or shrunk significantly relative to the larger OLED. Making this assumption, the microdisplay pixels are about 55.7/9 = ~6.2 times smaller linearly, or effectively ~38 times the pixels per unit area compared to the large OLED alone.
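The arithmetic behind the estimate, as a quick sketch (using ~9 microns as a round figure between the eMagin and Sony pitches):

```python
# Rough arithmetic behind the "~38x the pixels per unit area" estimate.
# Pixel pitches in microns: the Oculus large OLED vs. a ~9 micron OLED
# microdisplay (eMagin is 9.3 um, Sony is 8.7 um; 9 um splits the
# difference).
context_pitch_um = 55.7
foveated_pitch_um = 9.0

linear_ratio = context_pitch_um / foveated_pitch_um
area_ratio = linear_ratio ** 2   # pixel density scales with area

print(f"linear: ~{linear_ratio:.1f}x smaller pixels")   # ~6.2x
print(f"area:   ~{area_ratio:.0f}x the pixel density")  # ~38x
```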

The good thing about this configuration is that it is very simple and straightforward, a classically simple way to combine two images, at least the way it looks. But the devil is often in the details, particularly in what the prototype is not doing.

Current Varjo Prototype Does Not Track the Eye

The Varjo “prototype” (picture at left is from Ubergismo) is more of a concept demonstrator in that it does not demonstrate moving the high resolution image with eye tracking. The current unit is based on a modified Oculus headset (obvious from the picture; see the red oval I added). They are using the two larger Oculus OLED displays for the context (wide FOV) image and have added an OLED microdisplay per eye for the foveated display. In this prototype, a static beam splitter combines the two images, so the location of the high resolution part of the image is fixed and requires that the user look straight ahead to get the foveated effect. While eye tracking is well understood, it is not clear how successfully they can make the high resolution inset image track the eye and whether a human will notice the boundary (I will save the rest of this discussion for part 2).

Foveated Displays Raison D’être

Near eye display resolution is improving at a very slow rate and is unlikely to dramatically improve. People claiming “Moore’s Law” applies to display devices are either dishonest or don’t understand the problems. Microdisplays (on I.C.s) are already being limited by the physics of diffraction as their pixels (or color sub-pixels) get within 5 times the wavelengths of visible light. Making microdisplays bigger to support more pixels drives the cost up dramatically, and this is not rapidly improving; thus high resolution microdisplays are still, and will remain, very expensive.

Direct view display technologies, while they have become very good for making large high resolution displays, can’t be made small enough for lightweight head-mounted displays with high angular resolution. As I discussed in the Gap in Pixel Sizes article (for reference, I have included the chart from that article), which I published before I heard of Varjo, microdisplays enable high angular resolution but a small FOV, while adapted direct view displays support low angular resolution with a wide FOV. I was already planning on explaining why foveated displays are the only way in the foreseeable future to support high angular resolution with a wide FOV, so from my perspective, Varjo’s announcement was timely.

Foveated Displays In Theory Should Work

It is well known that the human eye’s resolution falls off considerably from the high resolution fovea/center vision to the peripheral vision (see the typical graph at right). I should caution that this is for a still image and that the human visual system is not this simple; in particular, it has a sensitivity to motion that this graph can’t capture.

It has been well proven by many research groups that if you can track the eye and provide variable resolution, the eye cannot tell the difference from a high resolution display (a search for “Foveated” will turn up many references and videos). The primary use today is foveated rendering to greatly reduce the computational requirements of a VR environment.

Varjo is trying to exploit the same foveated effect to give effectively very high resolution from two (per eye) much lower resolution displays. In theory it could work, but will it in practice? In fact, the idea of a “Foveated Display” is not new; Magic Leap discussed it in their patents with a fiber scanning display. Personally, the idea seems to come up a lot in “casual discussions” on the limits of display resolution. The key question becomes: is Varjo’s approach going to be practical and will it work well?

Obvious Issues With Varjo’s Foveated Display

The main lens (nearest the eye) is designed to bring the large OLED into focus, as in most of today’s VR headsets. The first obvious issue is that the lens would need to resolve pixels more than 6 times smaller than what a typical VR headset lens is designed for. Typical VR headset lenses are, well . . ., cheap crap with horrible image quality. To some degree, they are deliberately blurry/bad to try and hide the screen door effect of the highly magnified large display. The Varjo headset would need vastly better, much more expensive, and likely larger and heavier optics for the foveated display; for example, instead of a simple cheap plastic lens, they may need multiple elements (multiple lenses), perhaps made of glass.

The next issue is the tilting combiner and the way it moves the image. For simple up/down movement, the foveated display’s image will follow a simple up/down path, but if the 45 degree mirror tilts side to side, the center of the image will follow an elliptical path and the image will rotate, making it more difficult to align with the context image.
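A simple ray-reflection sketch (my own idealized model, not Varjo's actual optics) shows the asymmetry between the two tilt directions. Light from the microdisplay is assumed to travel straight down onto a 45 degree combiner that reflects it toward the eye along the z-axis; the side-to-side tilt is modeled here as a rotation about the viewing (z) axis:

```python
import math

def rotate_x(v, a):
    """Rotate vector v about the horizontal x-axis by angle a."""
    x, y, z = v
    return (x, y * math.cos(a) - z * math.sin(a),
               y * math.sin(a) + z * math.cos(a))

def rotate_z(v, a):
    """Rotate vector v about the viewing z-axis by angle a."""
    x, y, z = v
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a), z)

def reflect(d, n):
    """Reflect ray direction d off a mirror with unit normal n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2 * dot * ni for di, ni in zip(d, n))

# Nominal geometry: microdisplay light travels straight down (0,-1,0);
# the 45-degree combiner normal is (0,1,1)/sqrt(2), sending the light
# to the eye along +z.
n0 = (0.0, 1.0 / math.sqrt(2), 1.0 / math.sqrt(2))
d = (0.0, -1.0, 0.0)
print(reflect(d, n0))  # ~ (0, 0, 1): straight at the eye

for phi_deg in (1, 2, 4):
    phi = math.radians(phi_deg)
    # Pitch tilt (about the horizontal x-axis): the reflected ray
    # deflects vertically by exactly 2*phi, a simple linear shift.
    r_pitch = reflect(d, rotate_x(n0, phi))
    # Side-to-side tilt: horizontal deflection ~phi, but with a
    # second-order vertical component, so the image center traces a
    # curved path rather than a straight line.
    r_side = reflect(d, rotate_z(n0, phi))
    print(phi_deg, r_pitch, r_side)
```

The pitch tilt gives a clean vertical deflection of twice the mirror angle, while the side-to-side tilt couples a second-order vertical term into the horizontal motion, producing the curved path (and, for off-axis rays, the image rotation) described above.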

I would also be very concerned about the focus of the image as the mirror tilts through its range, since the path length from the microdisplay to the main optics changes, both at the center (which might be fixable by complex movement of the beam splitter) and at the corners (which may be much more difficult to solve).

Then there is the general issue of whether the user will be able to detect the blend point between the foveated and context displays. They have to re-map the rotated foveated image to match the context display, which will lose (per Nyquist re-sampling) about half the resolution of the foveated image. While they will likely cross-fade between the foveated and context displays, I am concerned (to be addressed in more detail in part 2) that the transition will be visible/human-detectable, particularly when things move (the eye is very sensitive to movement).

What About Vergence/Accommodation (VAC)?

The optical configuration of Varjo’s Foveated Display is somewhat similar to that of Oculus’s VAC display; both leverage a beam splitter. But then how would you do VAC with a Foveated Display?

In my opinion, solving resolution with a wide field of view is a more important/fundamentally necessary problem to solve than VAC at the moment. It is not that VAC is not a real issue, but if you don’t have resolution with a wide FOV, then solving VAC doesn’t really matter.

At the same time, this points out how far away headsets that “solve all the world’s problems” are from production. If you are waiting for high resolution with a wide field of view that also addresses VAC, you may be in for a wait of many decades.

Does Varjo Have a Practical Foveated Display Solution?

So the problem with display resolution/FOV growth is real, and in theory a foveated display could address this issue. But has Varjo solved it? At this point, I am not convinced, and I will try and work through some numbers and more detailed reasoning in part 2.