
Varjo Foveated Display Part 2 – Region Sizes

Introduction

As discussed in Part 1, the basic concept of a foveated display should in theory provide high angular resolution with a wide FOV. There is no single display technology today for near-to-eye displays that does both: microdisplays (LCOS, DLP, and OLED) support high angular resolution but not wide FOV, while larger flat panel displays (OLED and LCD) support wide FOV but with low angular resolution.

The image above left includes crops from the picture on Varjo's web site called "VR Scene Detail" (toward the end of this article is the whole annotated image). Varjo included both the foveated and un-foveated image from the center of the display. The top rectangle in red is taken from the top edge of the picture, where we can just see the transition starting from the foveated image to what Varjo calls the "context" or lower resolution image. Blending is used to avoid an abrupt transition that the eye might notice.

The topic of foveation gathered additional interest with Apple's acquisition of the eye tracking company SMI, which provided the eye tracking technology for Nvidia's foveated rendering HMD study (see below). It is not clear at this time why Apple bought SMI; it could be for foveated rendering (f-rendering) and/or a foveated display (f-display).

Static Visual Acuity

The common human visual acuity charts (right) give some feel for why foveation (f-rendering and/or f-display) works. But these graphs are for static images of high contrast black and white line pairs. We commonly talk about a person normally seeing down to 1 arcminute per pixel (300 dpi at about 10 inches) as being good, but people can detect down to about 1/2 arcminute, and with a long single high contrast line, down to about 1/4th of an arcminute. The point here is to understand that these graphs are a one-dimensional slice of a multi-dimensional issue.

For reference, Varjo's high resolution display has slightly less than 1 arcminute/pixel, and the context display in their prototype has about 4.7 arcminutes/pixel. More importantly, their high resolution display covers about 20 degrees horizontally and 15 degrees vertically, which, based on the visual acuity graphs, is within the range where people could see errors if they are high in contrast.
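
As a quick sanity check on these numbers, the sketch below (my own arithmetic, not Varjo's) converts a field of view and pixel count into arcminutes per pixel, using approximate figures consistent with those quoted above.

```python
def arcmin_per_pixel(fov_deg, pixels):
    """Approximate angular resolution, assuming pixels are spread evenly across the FOV."""
    return fov_deg * 60.0 / pixels

# Varjo-class foveated display: ~20 degrees horizontal at ~70 pixels/degree -> ~1400 pixels
print(round(arcmin_per_pixel(20, 1400), 2))    # ~0.86 arcmin/pixel, slightly under 1 arcminute
# Oculus-class context display: ~100 degrees spread over roughly 1,280 pixels
print(round(arcmin_per_pixel(100, 1280), 2))   # ~4.7 arcmin/pixel
```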

Varjo will be blending to reduce the contrast difference and thus make the transition less noticeable. But on the negative side, with any movement of the eyes, the image on the foveated display will change and the visual system tends to amplify any movement/change.

Foveated Rendering Studies

F-rendering varies the detail/resolution/quality/processing based on where the eyes are looking. This is seen as key to not only reducing the computing requirement but also saving power. F-rendering has been proven to work with many human studies, including those done as part of Microsoft's 2012 and Nvidia's 2016 papers. F-rendering becomes ever more important as resolution increases.

F-rendering uses a single high resolution display and changes the level of rendering detail. It then uses blending between the various detail levels to avoid abrupt changes that the eye would detect. As the Microsoft and Nvidia papers point out, the eye is particularly sensitive to changes/movement.

In the case of the often cited Microsoft 2012 paper, they used 3 levels of detail with two "blend masks" between them as illustrated in their paper (see right). This gave them a very gradual and wide transition, but 3 resolution levels with wide bands of transition are "luxuries" that Varjo can't have. Varjo only has two possible levels of detail, and as will be shown, they can only afford a narrow transition/blend region. The Microsoft 2012 study used only a 1920×1080 monitor with a lower resolution central region than Varjo (about half the resolution) and blending regions so broad that they would be totally impractical for an f-display.

Nvidia's 2016 study (which cites Microsoft 2012) simplified to two levels of detail, fovea and periphery, with sampling factors of 1 and 4 and a simpler linear blend between the two detail levels. Unfortunately, most of Nvidia's study was done with a very low angular resolution Oculus headset display with about 4.7 arcminutes/pixel and a little over 1,000 by 1,000 pixels per eye, the same display Varjo uses for the low resolution part of their image. Most of the graphs and discussion in the paper were with respect to this low angular resolution headset.
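
To make the two-level scheme concrete, here is a minimal sketch of how such a fovea/periphery composite might be blended. The fovea radius and blend width are illustrative placeholders of my own, not values from the Nvidia paper, and this is not their actual shader.

```python
import numpy as np

def blend_weight(ecc_deg, fovea_radius_deg=7.5, blend_width_deg=5.0):
    """Weight of the high-detail layer vs. eccentricity (degrees from the gaze point).
    1.0 inside the fovea, 0.0 in the periphery, with a linear ramp in between.
    The radius and width here are assumptions for illustration only."""
    return np.clip(1.0 - (ecc_deg - fovea_radius_deg) / blend_width_deg, 0.0, 1.0)

def composite(high_detail, low_detail, ecc_deg):
    """Per-pixel linear blend of two detail levels already upsampled to screen resolution."""
    w = blend_weight(ecc_deg)
    return w * high_detail + (1.0 - w) * low_detail
```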

Nvidia 2016 also did some study of a 27" (diagonal) 2560×1440 monitor with the user 81cm away, resulting in an angular resolution of about 1 arcminute and a horizontal FOV of 40 degrees, which would be more applicable to Varjo's case. Unfortunately, as the paper states of their user study, "We only evaluate the HMD setup, since the primary goal of our desktop study in Section 3.2 was to confirm our hypothesis for a higher density display." The only clue they give for the higher resolution system is that, "We set the central foveal radius for this setup to 7.5°." There was no discussion I could find of how they set the size of the blend region, so it is only a data point.

Comment/Request: I looked around for a study that would be more applicable to Varjo’s case. I was expecting to find a foveated rendering study using say a 4K (3840×2160) television which would support 1 arcminute for 64 by 36 degrees but I did not find it. If you know of such a study let me know.

Foveated Rendering is Much Easier Than Foveated Display

Even if we had an f-rendering study of a ~1-arcminute peak resolution system, it would still only give us some insight into the f-display issues. F-rendering, while conceptually similar and likely to be required to support an f-display, is significantly simpler.

With f-rendering, everything is mathematical beyond the detection of the eye movement. The sizes of the high resolution and lower resolution region(s) and the blend region(s) can be arbitrary to reduce detection, and can even be dynamic based on content. The alignment between resolutions is perfectly registered. The color and contrast between resolutions are identical. The rendering of the high resolution area does not have to be scaled/re-sampled to match the background.

Things are much tougher for an f-display, as there are two physically different displays and the high resolution display has to be optically aligned/moved based on the movement of the eye. The alignment of the display resolutions is limited by the optics' ability to move the apparent location of the high resolution part of the image, and there is likely to be some vibration/movement even when aligned. The potential size of the high resolution display, as well as the size of the transition region, is limited by the size/cost of the microdisplay used. There can be only a single transition. The brightness, color, and contrast will be different between the two physically different displays (even if both are, say, OLED, the brightness and colors will not be exactly the same). Additionally, the high resolution display's image will have to be remapped to correct for optical distortion and to match the context/peripheral image; this will both reduce the effective resolution and introduce movement into the highest resolvable (by the eye) part of the FOV as the foveated display tracks the eye on what otherwise should be a stationary image.

When asked, Varjo has said that they have more capable systems in the lab than the fixed f-display prototype they are showing. But they stopped short of saying whether they have a fully working system, and they have provided no results of any human studies.

The bottom line here is that there are many more potential issues with an f-display that could prove to be very hard, if not practically impossible, to solve. A major problem is getting the high resolution image to optically move and stop without the eye noticing it. It is impossible to fully understand how well it will work without a full-blown working system and a study with humans using a wide variety of content and user conditions, including the user moving their head and the reaction of the display and optics.

Varjo’s Current Demo

Varjo is currently demoing a proof of concept system with the foveated/high-resolution image fixed and not tracking the center of vision. The diagram below shows the 100 by 100 degree FOV of the current Varjo demonstration system. For the moment at least, let's assume their next step will be a version of this where the center/foveated image moves.

Shown in the figure above is roughly the size of the foveated display region (green rectangle), which covers about 27.4 by 15.4 degrees. The dashed red rectangle shows the area covered by the pictures provided by Varjo, which does not even fully cover the foveated area (in the pictures they just show the start of the transition/blending from high to low resolution).

Also shown is a dashed blue circle with the 7.5 degree "central foveal radius" (15 degree diameter) of the Nvidia 2016 high angular resolution system. It is interesting that it is pretty close to the angle covered vertically by the Varjo display.
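
For a quick cross-check of the region sizes (my own arithmetic, using the numbers quoted above and Varjo's "at least 70 pixels/degree" figure):

```python
pixels_per_degree = 70                    # Varjo's "at least 70 pixels/degree" claim
fovea_h_deg, fovea_v_deg = 27.4, 15.4     # estimated size of the green rectangle above

print(fovea_h_deg * pixels_per_degree,    # ~1918 pixels wide
      fovea_v_deg * pixels_per_degree)    # ~1078 pixels tall
# i.e. the foveated region is consistent with a roughly 1920x1080 microdisplay
```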

Will It Be Better Than A Non-Foveated Display (Assuming Very Good Eye Tracking)?

Varjo's foveated display should appear to the human eye as having much higher resolution than a non-foveated display with the same resolution as Varjo's context/periphery display. It is certainly going to work well when totally stationary (as in Varjo's demo system).

My major concern, and something that can't be tested without a full-blown system, comes when everything moves. The evidence above suggests that there may be visible moving noise at the boundaries of the foveated and context images.

Some of the factors that could affect the results:

  1. Size of the foveated/central image. Making this bigger would move the transition further out. This could be done optically or with a bigger device. Doing it optically could be expensive/difficult and using a larger device could be very expensive.
  2. The size of the transition/blur between the high and low resolution regions. It might be worth losing some of the higher resolution to get a smoother transition. From what I can tell, Varjo has a small transition/blend region compared to the f-rendering systems.
  3. The accuracy of the tracking and placement of the foveated image, in particular how accurately they can optically move the image. I wonder how well this will work in practice and whether it will have problems with head movement causing vibration.
  4. How fast they can move the foveated image and have it be totally still while displaying.

A Few Comments About Re-sampling of the Foveated Image

One should also note that the moving foveated image will by necessity have to be mapped onto the stationary low resolution image. Assuming the rendering pipeline first generates a rectilinear image and then re-samples it to adjust for the placement and optical distortion of the foveated image, the net effective resolution will be about half that of the "native" display due to the re-sampling.

In theory, this re-sampling loss could be avoided/reduced by computing the high resolution image with the foveated mapping already applied, but with "conventional" pipelines this would add a lot of complexity. This type of display would likely, in the long run, be used in combination with foveated rendering, where this may not add too much more to the pipeline (just something to deal with the distortion).
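
As a rough illustration of the kind of remapping being described (a sketch under my own assumptions, not Varjo's actual pipeline), the foveated image would be rendered in its own rectilinear coordinates and then resampled into the context display's pixel grid after eye tracking and the optics have placed it; the interpolation in that resampling step is where the roughly 2x effective resolution loss comes from.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample a 2-D image at fractional (x, y) coordinates with bilinear filtering."""
    h, w = img.shape
    x = np.clip(x, 0, w - 1.001)
    y = np.clip(y, 0, h - 1.001)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    fx, fy = x - x0, y - y0
    top = img[y0, x0] * (1 - fx) + img[y0, x0 + 1] * fx
    bot = img[y0 + 1, x0] * (1 - fx) + img[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bot * fy

def remap_foveated(fovea_img, out_shape, offset_xy, scale):
    """Resample the rendered foveated image into the context display's pixel grid.
    `offset_xy` stands in for where eye tracking placed the inset and `scale` for the
    optical magnification; both are placeholders, and a real system would also correct
    optical distortion here. Because each output pixel generally lands between source
    pixels, the interpolation costs roughly half the 'native' resolution."""
    ys, xs = np.mgrid[0:out_shape[0], 0:out_shape[1]].astype(float)
    src_x = (xs - offset_xy[0]) * scale
    src_y = (ys - offset_xy[1]) * scale
    return bilinear_sample(fovea_img, src_x, src_y)
```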

Annotated Varjo Image

First, I want to compliment Varjo for putting actual through-the-optics high resolution images on their website (note, click on their "Full size JPG version"). By Varjo's own admission, these pictures were taken crudely with a consumer camera, so the image quality is worse than you would see looking into the optics directly. In particular, there are chromatic aberrations clearly visible in the full size image that are likely caused by the camera and how it was used, and not necessarily a problem with Varjo's optics. If you click on the image below, it will bring up the full size image (over 4,000 by 4,000 pixels and about 4.5 megabytes) in a new tab.

If you look at the green rectangle, it corresponds to the size of the foveated image in the green rectangle of the prior diagram showing the whole 100 by 100 degree FOV.

You should be able to clearly see the transition/blending starting at the top and bottom of the foveated image (see also right). The end of the blending is cut off in the picture.

The angles given in the figure were calculated based on the known pixel size of the Oculus CV1 display (its pixels are clearly visible in the non-foveated picture). For the "foveated display" (green rectangle) I used Varjo's statement that it is at least 70 pixels/degree (but I suspect not much more than that either).

Next Time On Foveated Displays (Part 3)

Next time on this topic, I plan on discussing how f-displays may or may not compete in the future with higher resolution single displays.

Varjo Foveated Display (Part 1)

Introduction

The startup Varjo recently announced its Foveated Display (FD) technology and did a large number of interviews with the technical press about it. I'm going to break this article into multiple parts; as currently planned, the first part will discuss the concept and the need for it, and part 2 will discuss how well I think it will work.

How It Is Supposed to Work

Varjo's basic concept is relatively simple (see figure at left; click on it to pop it out). Varjo optically combines an OLED microdisplay with small pixels to give high angular resolution over a small area (what they call the "foveated display"), with a larger OLED display to give low angular resolution over a large area (what they call the "context display"). Using eye tracking (not done in the current prototype), the foveated display is optically moved to the center of the person's vision by tilting the beam splitter. Varjo says they have thought of, and are patenting, other ways of optically combining and moving the foveated image beyond a beam splitter.

The beam splitter is likely just a partially silvered mirror. It could be 50/50 or some other ratio to match the brightness of the large OLED and the microdisplay OLED. This type of combining is very old and well understood. They will likely blend/fade-in the image in the rectangular border where the two display images meet.
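
The basic combining relation for a simple splitter (my sketch of the classical optics, not something Varjo has stated) is that the eye sees a weighted sum of the two displays, with the weights set by the splitter ratio; which display is on the transmitted versus the reflected side depends on the geometry:

```latex
I_{eye} \approx T \cdot I_{context} + R \cdot I_{foveated}, \qquad R + T \le 1
```

A 50/50 splitter throws away half of each display's light, while an unequal ratio can trade light from the brighter display to better match the dimmer one.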

The figure above is based on a sketch by Urho Konttori, CEO of Varjo, in a video interview with Robert Scoble, combined with pictures of the prototype in Ubergismo (see below), plus answers to some questions I posed to Varjo. It is roughly drawn to scale based on the available information. The only thing I am not sure about is the "microdisplay lens," which was shown but not described in the Scoble interview. This lens (or lenses) may or may not be necessary depending on the distance of the microdisplay from the beam combiner, and could be used to help make the microdisplay pixels appear smaller or larger. If the optical path through the beam combiner to the large OLED (in the prototype, from an Oculus headset) equaled the path to the microdisplay via reflection off the combiner, then the microdisplay lens would not be necessary. Based on my scale drawing and looking at the prototype photographs, it is close to not needing the lens.

Varjo is likely using either an eMagin OLED microdisplay with a 9.3 micron pixel pitch or a Sony OLED microdisplay with an 8.7 micron pixel pitch. The Oculus headset OLED has a ~55.7 micron pixel pitch. It does not look from the configuration like the microdisplay image will be magnified or shrunk significantly relative to the larger OLED. Making this assumption, the microdisplay pixels are about 55.7/9 = ~6.2 times smaller linearly, which means effectively ~38 times the pixels per unit area compared to the large OLED alone.
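
Spelling out that arithmetic (using ~9 microns as a round number between the eMagin and Sony pitches):

```python
oculus_pitch_um = 55.7
micro_pitch_um = 9.0     # roughly between the eMagin (9.3) and Sony (8.7) pitches

linear_ratio = oculus_pitch_um / micro_pitch_um   # ~6.2x smaller pixels linearly
area_ratio = linear_ratio ** 2                    # ~38x the pixels per unit area
print(round(linear_ratio, 1), round(area_ratio))  # 6.2 38
```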

The good thing about this configuration is that it is very simple and straightforward and is a classically simple way to combine two images, at least that is the way it looks. But the devil is often in the details, particularly in what the prototype is not doing.

Current Varjo Prototype Does Not Track the Eye

The Varjo "prototype" (picture at left is from Ubergismo) is more of a concept demonstrator in that it does not demonstrate moving the high resolution image with eye tracking. The current unit is based on a modified Oculus headset (obvious from the picture; see the red oval I added). They are using the two larger Oculus OLED displays for the context (wide FOV) image and have added an OLED microdisplay per eye for the foveated display, with a static beam splitter combining the two images. In this prototype, the location of the high resolution part of the image is fixed, and the user must look straight ahead to get the foveated effect. While eye tracking is well understood, it is not clear how successfully they can make the high resolution inset image track the eye and whether a human will notice the boundary (I will save the rest of this discussion for part 2).

Foveated Displays Raison D’être

Near eye display resolution is improving at a very slow rate and is unlikely to dramatically improve. People quoting "Moore's Law" as applying to display devices are simply either dishonest or don't understand the problems. Microdisplays (on I.C.s) are already being limited by the physics of diffraction as their pixels (or color sub-pixels) get within 5 times the wavelengths of visible light. Making microdisplays bigger to support more pixels drives the cost up dramatically, and this is not rapidly improving; thus high resolution microdisplays are still, and will remain, very expensive.

Direct view display technologies, while they have become very good for making large high resolution displays, can't be made small enough for lightweight head-mounted displays with high angular resolution. As I discussed in The Gap in Pixel Sizes (for reference, I have included the chart from that article), which I published before I heard of Varjo, microdisplays enable high angular resolution but small FOV, while adapted direct view displays support low angular resolution with a wide FOV. I was already planning on explaining why foveated displays are the only way in the foreseeable future to support high angular resolution with a wide FOV, so from my perspective, Varjo's announcement was timely.

Foveated Displays In Theory Should Work

It is well known that the human eye's resolution falls off considerably from the high resolution fovea/center vision to the peripheral vision (see the typical graph at right). I should caution that this is for a still image and that the human visual system is not this simple; in particular, it has sensitivity to motion that this graph can't capture.

It has been well proven by many research groups that if you can track the eye and provide variable resolution, the eye cannot tell the difference from a uniformly high resolution display (a search for "foveated" will turn up many references and videos). The primary use today is foveated rendering to greatly reduce the computational requirements of a VR environment.

Varjo is trying to exploit the same foveated effect to give effectively very high resolution from two (per eye) much lower resolution displays. In theory, it could work, but will it in practice? In fact, the idea of a "foveated display" is not new; Magic Leap discussed it in their patents with a fiber scanning display. Personally, the idea seems to come up a lot in "casual discussions" on the limits of display resolution. The key question becomes: is Varjo's approach going to be practical and will it work well?

Obvious Issues With Varjo’s Foveated Display

The main lens (nearest the eye) is designed to bring the large OLED into focus, like most of today's VR headsets. The first obvious issue is that the foveated display requires the optics to resolve pixels that are more than 6 times smaller than what a typical VR headset lens is designed for. Typical VR headset lenses are, well . . ., cheap crap with horrible image quality. To some degree, they are deliberately blurry/bad to try and hide the screen door effect of the highly magnified large display. The Varjo headset would need vastly better, much more expensive, and likely larger and heavier optics for the foveated display; for example, instead of using a simple cheap plastic lens, they may need a multiple element design (multiple lenses), perhaps made of glass.

The next issue is that of the tilting combiner and the way it moves the image. For simple up/down movement, the foveated display's image will follow a simple up/down path, but if the 45 degree mirror tilts side to side, the center of the image will follow an elliptical path and rotate, making it more difficult to align with the context image.

I would also be very concerned about the focus of the image as the mirror tilts through its range, since the path lengths from the microdisplay to the main optics change, both at the center (which might be fixable by complex movement of the beam splitter) and at the corners (which may be much more difficult to solve).

Then there is the general issue of whether the user will be able to detect the blend point between the foveated and context displays. They have to map the rotated foveated image to match the context display, which will lose (per Nyquist re-sampling) about half the resolution of the foveated image. While they will likely cross-fade between the foveated and context displays, I am concerned (to be addressed in more detail in part 2) that the transition will be visible/detectable to humans, particularly when things move (the eye is very sensitive to movement).

What About Vergence/Accommodation (VAC)?

The optical configuration of Varjo’s Foveated Display is somewhat similar to that of Oculus’s VAC display. Both leverage a beam splitter, but then how would you do VAC with a Foveated Display?

In my opinion, solving resolution with a wide field of view is a more important/fundamentally necessary problem to solve than VAC at the moment. It is not that VAC is not a real issue, but if you don't have resolution with a wide FOV, is solving VAC really necessary?

At the same time, this points out how far away headsets that "solve all the world's problems" are from production. If you are waiting for high resolution with a wide field of view that also addresses VAC, you may be in for a wait of many decades.

Does Varjo Have a Practical Foveated Display Solution?

So the problem with display resolution/FOV growth is real, and in theory a foveated display could address this issue. But has Varjo solved it? At this point, I am not convinced, and I will try and work through some numbers and more detailed reasoning in part 2.

Avegant “Light Field” Display – Magic Leap at 1/100th the Investment?

Surprised at CES 2017 – Avegant Focus Planes (“Light Field”)

While at CES 2017 I was invited to Avegant's suite and was expecting to see a new and improved and/or lower cost version of the Avegant Glyph. The Glyph was hardly revolutionary; it is a DLP-based, non-see-through near eye display built into a set of headphones with reasonably good image quality. Based on what I was expecting, it seemed like a bit much to be signing an NDA just to see what they were doing next.

But what Avegant showed was essentially what Magic Leap (ML) has been claiming to do in terms of focus planes/"light fields" with vergence and accommodation. Avegant had accomplished this with likely less than 1/100th the amount of money ML is reported to have raised (ML has raised to date about $1.4 billion). In one stroke, they made ML more believable and at the same time raised the question of why ML needed so much money.

What I saw – Technology Demonstrator

I was shown a headset with two HDMI cables for video and a USB cable for power and sensor data, all bundled together and going to an external desktop computer. A big plus for me was that there was enough eye relief that I could wear my own glasses (I have severe astigmatism, so diopter adjustments alone don't work for me). The picture at left is the same or a similar prototype to the one I wore. The headset was a bit bulkier than, say, Hololens, plus the bundle of cables coming out of it. Avegant made it clear that this was an engineering prototype and nowhere near a finished product.

The mixed reality/see-through headset merges the virtual world with the see-through real world. I was shown three (3) mixed reality (MR) demos, a moving Solar System complete with asteroids, a Fish Tank complete with fish swimming around objects in the room and a robot/avatar woman.

Avegant makes the point that the content was easily ported from Unity into their system, with the fish tank model coming from the Monterey Bay Aquarium and the woman and solar system being downloaded from the Unity community open source library. The 3-D images were locked to the "real world," taking this from simple AR into MR. The tracking was not at all perfect, nor did I care; the point of the demo was the focal planes, and lots of companies are working on tracking.

It is easy to believe that by "turning the crank" they can eliminate the bulky cables and that the tracking and locking between the virtual and real world will improve. It was a technology capability demonstrator, and on that basis it has succeeded.

What Made It Special – Multiple Focal Planes / “Light Fields”

What ups the game from, say, Hololens and takes it into the realm of Magic Leap is that it supported simultaneous focal planes, what Avegant calls "Light Fields" (a bit different from true "light fields" as I see it). The user could change where they were focusing in the depth of the image and bring things that were close or far into focus. In other words, they simultaneously present multiple focuses to the eye. You could also, by shifting your eyes, see behind objects a bit. This is clearly something optically well beyond Hololens, which does simple stereoscopic 3-D and in no way presents multiple focus points to the eye at the same time.

In short, what I was seeing in terms of vergence and accommodation was everything Magic Leap has been claiming to do. But Avegant has clearly spent only a very small fraction of the development cost, it was at least portable enough that they had it set up in a hotel room, and the optics look to be economical to make.

Now, it was not perfect, nor was Avegant claiming it to be at this stage. I could see some artifacts, in particular lots of what looked like faint diagonal lines. I'm not sure if these were a result of the multiple focal planes or some other issue such as a bug.

Unfortunately, the only "through the lens" video currently available is at about 1:01 in Avegant's "Introducing Avegant Light Field" Vimeo video. There are only a few seconds of it, and it really does not demonstrate the focusing effects well.

Why Show Me?

So why were they showing it to me, an engineer known to be skeptical of demos? They knew of my blog, and that is why I was invited to see the demo. Avegant was in some ways surprisingly open about what they were doing and answered most, but not all, of my technical questions. They appeared to be making an effort to make sure people understand it really works. It seems clear they wanted someone who would understand what they had done and could verify it is something different.

What They Are Doing With the Display

While Avegant calls their technology "Light Fields," it is implemented with (directly quoting them) "a number of fixed digital focal planes, and then interpolate the planes in-between them." Multiple focus planes have many of the same characteristics as classical light fields, but require much less image data to be simultaneously presented to the eye, saving power on generating and displaying image data, much of which the eye would not "see"/use.

They are currently using a 720p DLP per eye for the display engine but they said they thought they could support other display technologies in the future. As per my discussion on Magic Leap from November 2016, DLP has a high enough field rate that they could support displaying multiple images with the focus changing between images if you can change the focus fast enough. If you are willing to play with (reduce) color depth, DLP could support a number of focus planes. Avegant would not confirm if they use time sequential focus planes, but I think it likely.
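
As a back-of-the-envelope illustration of why a fast binary display makes time-sequential focal planes plausible, the sketch below uses a simplified model in which each bit of color depth costs one binary DLP frame; the numbers are my own order-of-magnitude assumptions, not Avegant's specifications or the actual DLP drive scheme.

```python
# Illustrative budget for time-sequential focal planes (assumed numbers, simplified model):
binary_frames_per_sec = 9600   # order-of-magnitude 1-bit pattern rate for a fast DLP
video_rate_hz = 60
bits_per_color = 6             # reduced color depth to free up time
colors = 3

frames_per_video_frame = binary_frames_per_sec / video_rate_hz       # 160 binary frames
frames_needed_per_plane = bits_per_color * colors                    # 18 per focal plane
print(int(frames_per_video_frame // frames_needed_per_plane))        # ~8 focal planes
```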

They are using "birdbath optics" per my prior article, with a beam splitter and spherical semi-mirror/combiner (see picture at left). With a DLP illuminated by LEDs, they can afford the higher light losses of the birdbath design and support a reasonable amount of transparency to the real world. Note, waveguides also tend to lose/waste a large amount of light as well. Avegant said that the current system was 50% transparent to the real world but that they could make it more transparent (by wasting more light).

Very importantly, a birdbath optical design can be very cheap (on the order of only a few dollars), whereas waveguides can cost many tens of dollars (reportedly Hololens' waveguides cost over $100 each). The birdbath optics also can support a very wide field of view (FOV), something generally very difficult/expensive to support with waveguides. The optical quality of a birdbath is generally much better than the best waveguides. The downside of the birdbath compared to waveguides is that it is bulkier and does not look as much like ordinary glasses.

What they would not say – Exactly How It Works

The one key thing they would not say is how they are supporting the change in focus between focal planes. The obvious way to do it would be with some kind of electromechanical device, such as a moving focus element or a liquid filled lens (the obvious suspects). In a recent interview, they repeatedly said that there were no moving parts and that it was "economical to make."

What They are NOT Doing (exactly) – Mechanical Focus and Eye/Pupil Tracking

After meeting with Avegant at CES, I decided to check out their recent patent activity and found US 2016/0295202 ('202). It shows a birdbath optics system (but with a non-see-through curved mirror). This configuration, with a semi-mirrored curved element, would seem to do what I saw. In fact, it is very similar to what Magic Leap showed in their US application 2015/0346495.

Avegant's '202 application uses a "tuning assembly 700" (some form of electro-mechanical focus).

It also uses eye tracking 500 to know where the pupil is aimed. Knowing where the pupil is aimed would, at least in theory, allow them to generate a focus plane for where the eye is looking and an out-of-focus plane for everything else. At least in theory that is how it would work, but this might be problematical (no fear, this is not what they are doing, remember).

I specifically asked Avegant about the '202 application, and they said categorically that they were not using it and that the applications related to what they are using have not yet been published (I suspect they will be published soon, perhaps part of the reason they are announcing now). They categorically stated that there were "no moving parts" and that they "did not eye track" for the focal planes. They stated that the focusing effect would even work with, say, a camera (rather than an eye) and was in no way dependent on pupil tracking.

A lesson here is that even small companies file patents on concepts that they don't use. But still, this application gives insight into what Avegant was interested in doing and some clues as to how they might be doing it. Eliminate the eye tracking and substitute a non-mechanical focus mechanism that is rapid enough to support 3 to 6 focus planes, and it might be close to what they are doing (my guess).

A Caution About “Demoware”

A big word of warning here about demoware. When seeing a demo, remember that you are being shown what makes the product look best and examples that might make it look not so good are not shown.

I was shown three short demos that they picked; I had no choice and could not pick my own test cases. I also don't know exactly the mechanism by which it works, which makes it hard to predict the failure modes, as in what type of content might cause artifacts. For example, everything I was shown was very slow moving. If they are using sequential focus planes, I would expect to see problems/artifacts with fast motion.

Avegant’s Plan for Further Development

Avegant is in the process of migrating away from requiring a big PC and onto mobile platforms such as smartphones. Part of this is continuing to address the computing requirement.

Clearly they are going to continue refining the mechanical design of the headset and will either get rid of or slim down the cables and have them go to a mobile computer. They say that all the components are easily manufacturable, and this I would tend to believe. I do wonder how much image data they have to send, but it appears they are able to do it with just two HDMI cables (one per eye). It would seem they will be wire-tethered to a (mobile) computing system. I'm more concerned about how the image quality might degrade with, say, fast moving content.

They say they are going to be looking at other (than the birdbath) combiner technologies; one would assume a waveguide of some sort to make the optics thinner and lighter. But going to waveguides could hurt image quality and cost and may further limit the FOV.

Avegant is leveraging the openness of Unity to support getting a lot of content generation for their platform. They plan on a Unity SDK to support this migration.

They said they will be looking into alternatives for the DLP display; I would expect LCOS and OLED to be considered. They said that they had also thought about laser beam scanning, but their engineers objected to trying it for eye safety reasons; engineers are usually the first guinea pigs for their own designs, and a bug could be catastrophic. If they are using time sequential focal planes, which is likely, then other technologies such as OLED, LCOS, or laser beam scanning cannot generate sequential planes fast enough to support more than a few (1 to 3) focal planes per 1/60th of a second on a single device at maximum resolution.

How Important is Vergence/Accommodation (V/A)?

The simple answer is that it appears that Magic Leap raised $1.4B by demoing it. But as they say, “all that glitters is not gold.” The V/A conflict issue is real, but it mostly affects content that virtually appears “close”, say inside about 2 meters/6 feet.
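
The reason the conflict is mostly a close-range problem falls out of the geometry (my own worked numbers, assuming a typical ~64mm interpupillary distance). The vergence angle for an object at distance d is:

```latex
\theta_{vergence} = 2\arctan\!\left(\frac{IPD}{2d}\right)
```

At 2 meters that is only about 1.8 degrees (and the focus demand is 1/d = 0.5 diopters), while at 0.5 meters it jumps to about 7.3 degrees and 2 diopters; beyond a couple of meters both demands change very slowly, so the vergence/accommodation mismatch matters much less.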

It's not clear whether for "everyday use" there might be simpler, less expensive, and/or lower power ways to deal with V/A conflict, such as pupil tracking. Maybe (I don't know) it would be enough to simply change the focus point when the user is doing close-up work rather than have multiple focal planes presented to the eye simultaneously.

The business question is whether solving V/A alone will make AR/MR take off. I think the answer is clearly no; this is not the last puzzle piece to be solved before AR/MR takes off. It is one of a large number of issues yet to be solved. Additionally, while Avegant says they have solved it economically, what is economical is relative. It still has added weight, power, processing, and costs associated with it, and it has negative impacts on the image quality; the classic "squeezing the balloon" problem.

Even if V/A added nothing and cost nothing extra, there are still many other human factor issues that severely limit the size of the market. At times like this, I like to remind people of the Artificial Intelligence boom in the 1980s (over 35 years ago), which it seemed all the big and many small companies were chasing as the next era of computing. There were lots of "breakthroughs" back then too, but the problem was bigger than all the smart people and money could solve.

BTW, if you want to know more about V/A and related issues, I highly recommend reading papers and watching videos by Gordon Wetzstein of Stanford. Particularly note his work on "compressive light field displays," which he started working on while at MIT. He does an excellent job of taking complex issues and making them understandable.

Generally Skeptical About The Near Term Market for AR/MR

I'm skeptical that, with or without Avegant's technology, the Mixed Reality (MR) market is really set to take off for at least 5 years (and likely more). I've participated in a lot of revolutionary markets (early video game chips, home/personal computers, graphics accelerators, Synchronous DRAMs, as well as various display devices) and I'm not a Luddite/flat-earther; I simply understand the challenges still left unsolved, and there are many major ones.

Most of the market forecasts for huge volumes in the next 5 years are written by people who don't have a clue as to what is required; they are more science fiction writers than technologists. You can already see companies like Microsoft with Hololens, and before them Google with Google Glass, retrenching/regrouping.

Where Does Avegant Go Business Wise With this Technology?

Avegant is not a big company. They were founded in 2012. My sources tell me that they have raised about $25M, and I have heard that they have only sold about $5M to $10M worth of their first product, the Avegant Glyph. I don't see the Glyph ever being a high volume product with a lot of profit to support R&D.

A related aside: I have yet to see a Glyph "in the wild" being used, say, on an airplane (where they would make the most sense). Even though the Glyph and other headsets exist, people given a choice still, by vast percentages, prefer larger smartphones and tablets for watching media on the go. The Glyph sells for about $500 now and is bulky to store, whereas a tablet easily slips into a backpack or other bag and the display is "free"/built in.

But then, here you have this perhaps "key technology" that works and that is doing something Magic Leap has raised over $1.4 billion to try and do. It is possible (having not thoroughly tested either one) that Avegant's is better than ML's. Avegant's technology is likely much more cost effective to make than ML's, particularly if ML's depends on using their complex waveguide.

Having not seen the details of either Avegant's or ML's method, I can't say which is "best," both image-wise and in terms of cost, nor, from a patent perspective, whether Avegant's is different from ML's.

So Avegant could try and raise money to do it on their own, but they would have to raise a huge amount to last until the market matures and compete with much bigger companies working in the area. At best they have solved one (of many) interesting puzzle pieces.

It seems obvious (at least to me) that the more likely good outcome for them would be as a takeover target by someone that has the deep pockets to invest in mixed reality for the long haul.

But this should certainly make the Magic Leap folks and their investors take notice. With less fanfare, and a heck of a lot less money, Avegant has a solution to the vergence/accommodation problem that ML has made such a big deal about.

ODG R-8 and R-9 Optics with OLED Microdisplays (Likely Sony's)

ODG Announces R-8 and R-9 OLED Microdisplay Headsets at CES

It was not exactly a secret, but Osterhout Design Group (ODG) formally announced their new R-8 headset with dual 720p displays (one per eye) and R-9 headset with dual 1080p displays. According to their news release, "R-9 will be priced around $1,799 with initial shipping targeted 2Q17, while R-8 will be less than $1,000 with developer units shipping 2H17."

Both devices use OLED microdisplays but with different resolutions (the R-9 has twice the pixels). The R-8 has a 40 degree field of view (FOV), which is similar to Microsoft's Hololens, and the R-9 has about a 50 degree FOV.

The R-8 appears to be marketed more toward "consumer" uses with its lower price point and lack of an expansion port, while ODG is targeting the R-9 at more industrial uses with modular expansion. Among the expansion modules that ODG has discussed are various cameras and better real world tracking.

ODG R-7 Beam Splitter Kicks Image Toward Eye

With the announcement came much better pictures of the headsets, and I immediately noticed that their optics were significantly different than I previously thought. Most importantly, I noticed in an ODG R-8 picture that the beam splitter is angled to kick the light away from the eye, whereas the prior ODG R-7 had a simple beam splitter that kicks the image toward the eye (see below).

ODG R-8 and R-9 Beam Splitter Kicks Image Away From Eye and Into A Curved Mirror

The ODG R-8 (and the R-9, though it is harder to see in the available R-9 pictures) does not have a simple beam splitter but rather a beam splitter and curved mirror combination. The side view below (with my overlays of the outline of the optics, including some that are not visible) shows that the beam splitter kicks the light away from the eye and toward a partial curved mirror that acts as a "combiner." This curved mirror magnifies and moves the virtual focus point and then reflects the light back through the beam splitter to the eye.

On the left I have taken Figure 169 from ODG’s US Patent 9,494,800. Light from the “emissive display” (ala OLED) passes through two lenses before being reflected into the partial mirror. The combination of the lenses and the mirror act to adjust the size and virtual focus point of the displayed image. In the picture of the ODG R-8 above I have taken the optics from Figure 169 and overlaid them (in red).
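
For those wanting the underlying optics, the curved partial mirror behaves like a concave magnifier, and the classical mirror relation (standard textbook optics, not something taken from ODG's patent) ties the display-image distance to the virtual image distance:

```latex
\frac{1}{d_o} + \frac{1}{d_i} = \frac{2}{R_m}
```

Here d_o is the distance from the lens-relayed display image to the mirror, d_i the virtual image distance, and R_m the mirror's radius of curvature (sign conventions vary); placing the source near the focal distance R_m/2 pushes the virtual image far out so the eye can focus on it comfortably.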

According to the patent specification, this configuration “form(s) at wide field of view” while “The optics are folded to make the optics assembly more compact.”

At left I have cropped the image and removed the overlay so you can see the details of the beam splitter and curved mirror joint.  You hopefully can see the seam where the beam splitter appears to be glued to the curved mirror suggesting the interior between the curved mirror and beam splitter is hollow. Additionally there is a protective cover/light shade over the outside of the curved mirror with a small gap between them.

The combined splitter/mirror is hollow to save weight and cost. It is glued together to keep dust out.

ODG R-6 Used A Similar Splitter/Mirror

I could not find a picture of the R-8 or R-9 from the inside, but I did find a picture on the “hey Holo” blog that shows the inside of the R-6 that appears to use the same optical configuration as the R-8/R-9. The R-6 introduced in 2014 had dual 720p displays (one per eye) and was priced at $4,946 or about 5X the price of the R-8 with the same resolution and similar optical design.  Quite a price drop in just 2 years.

ODG R-6, R-8, and R-9 Likely Use Sony OLED Microdisplays

Interestingly, I could not find anywhere where ODG says what display technology they used in the 2014 R-6, but the most likely device is the Sony ECX332A 720p OLED microdisplay that Sony introduced in 2011. Following this trend, it is likely that the ODG R-9 uses the newer Sony ECX335 1080p OLED microdisplay and the R-8 uses the ECX332A or a follow-on version. I don't know of any other company that has both 720p and 1080p OLED microdisplays, and the timing of the Sony and ODG products seems to fit. It is also very convenient for ODG that both panels are the same size and could use the same or very similar optics.

Sony had a 9.6 micron pixel on a 1024 by 768 OLED microdisplay back in 2011, so for Sony the pixel pitch has gone from 9.6 microns in 2011 to 8.2 microns on the 1080p device. This is among the smallest OLED microdisplay pixel pitches I have seen, but it is still more than 2x linearly and 4x in area bigger than the smallest LCOS (several companies have LCOS pixel pitches in the 4 micron or less range).

It appears that ODG used an OLED microdisplay for the R-6, then switched (likely for cost reasons) to LCOS and a simple beam splitter for the R-7, and then back to OLEDs and the splitter/mirror optics for the R-8 and R-9.

Splitter/Combiner Is an Old Optic Trick

This “trick” of mixing lenses with a spherical combiner partial mirror is an old idea/trick. It often turns out that mixing refractive (lenses) with mirror optics can lead to a more compact and less expensive design.

I have seen a beam splitter/mirror combination used many times. The ODG design is a little different in that the beam splitter is sealed/mated to the curved mirror, which, with the pictures available earlier, was hard to see. Likely as not, this has been done before too.

This configuration of beam splitter and curved mirror even showed up in Magic Leap applications, such as Fig. 9 from 2015/0346495 shown at right. I think this is the optical configuration that Magic Leap used with some of their prototypes, including the one seen by "The Information."

Conclusion/Trends – Turning the Crank

The ODG optical design, while it may seem a bit more complex than a simple beam splitter, is actually probably simpler/easier to make than doing everything with lenses before the beam splitter. Likely they went to this technique to support a wider FOV.

Based on my experience, I would expect that the ODG optical design will be cleaner/better than the waveguide designs of Microsoft's Hololens. The use of OLED microdisplays should give ODG superior contrast, which will further improve the perceived sharpness of the image. While not as apparent to the casual observer, as I have discussed previously, OLEDs won't work with diffractive/holographic waveguides such as Hololens and Magic Leap are using.

What is also interesting is that, in terms of resolution and basic optics, the R-8 with 720p is about 1/5th the price of the military/industrial grade 720p R-6 of about 2 years ago. The R-9, in addition to having a 1080p display, has some modular expansion capability. One would expect a follow-on product in the not too distant future with 1080p, a larger FOV, and more sensors in the price range of the R-8, perhaps with integration of the features from one or more of the R-9's add-on modules; this, as we say in the electronics industry, "is just a matter of turning the crank."

Everything VR & AR Podcast Interview with Karl Guttag About Magic Leap

With all the buzz surrounding Magic Leap and this blog’s technical findings about Magic Leap, I was asked to do an interview by the “Everything VR & AR Podcast” hosted by Kevin Harvell. The podcast is available on iTunes and by direct link to the interview here.

The interview starts with about 25 minutes of my background, starting with my early days at Texas Instruments. So if you just want to hear about Magic Leap and AR, you might want to skip ahead a bit. In the second part of the interview (about 40 minutes) we get into discussing how I went about figuring out what Magic Leap was doing. This includes discussing how the changes in the U.S. patent system signed into law in 2011 with the America Invents Act helped make the information available for me to study.

There should be no great surprises for anyone that has followed this blog. It puts in words and summarizes a lot that I have written about in the last 2 months.

Update: I listened to the podcast and noticed that I misspoke a few times; it happens in live interviews. An unfathomable mistake is that I talked about graduating college in 1972, but that was high school; I graduated from Bradley University with a B.S. in Electrical Engineering in 1976 and then received an MSEE from The University of Michigan in 1977 (and joined TI in 1977).

I also think I greatly oversimplified the contribution of Mark Harward as a co-founder at Syndiant. Mark did much more than just hire designers; he was the CEO, an investor, and he ran the company while I "played" with the technology, and I think Mark's best skill was in hiring great people. Also, Josh Lund, Tupper Patnode, and Craig Waller were co-founders.

 

Magic Leap: Are Parts of Their NDAs Now Unenforceable?

Rony Abovitz's tweet about "mousy tech bloggers" and one of its responses made me realize something I was taught way back about NDAs and intellectual property. It is summarized well (with my bold emphasis) in the article, "What You Probably Don't Know About Non Disclosure Agreements":

Remember that if you have 99 people sign an NDA and 1 person doesn’t, that person can publish your idea in the Wall Street Journal – and to add insult to injury, when they do, the other NDAs all become invalid since they only apply to confidential information.

Reed Albergotti of "The Information" was shown demos that previously were not open to the public and, as best I am aware, did not have an NDA or other confidentiality agreement. Also, David M. Ewalt of Forbes Magazine wrote on Reddit:

"I didn't sign an NDA, but I agreed not to reveal certain proprietary details."

Arghya Sur (copied above), in his response, asked Rony Abovitz to "publicly reveal and demos (sic)." So I'm left wondering what is and what is not confidential now at Magic Leap. Has Magic Leap inadvertently already done what Arghya Sur requested? Has Magic Leap at least caused some people to be released from some parts of their NDAs?

Disclaimer: I am not a lawyer, and my understanding is that this is a contract issue based on the laws of the state(s) that govern the NDAs in question. Also, I have not seen ML's NDAs, nor do I know what they cover. There are likely severability clauses meant to limit the damage if there is a breach of some parts.

And it might be even worse. As I remember it, if a company is generally sloppy in handling and protecting what they tell people is confidential material, then they can't enforce their confidentiality/NDA agreements. The principle is: how can you expect people to know what is "really confidential" from what is "only marked confidential"?

And Rony Abovitz is not just anybody at Magic Leap; he is the CEO, and he met with the reporters and presumably had some idea as to what they were being shown. This also goes to why you should not tweet about "mousy tech bloggers" if you are a CEO; it makes them ask questions.

I would appreciate it if those with expertise in this area would weigh in with comments. Please don't give legal advice to anyone, but rather let people know what you were taught about handling NDA material.

BTW

I am always amused and a little shocked when I see slides at open conferences with "Confidential" marked on them. I was taught to NEVER do this. If the material is no longer confidential, then remove the marking from the slides. You probably will not get the "confidential death sentence" for doing it once, but it should not become routine or the company might find all their confidentiality agreements unenforceable.