Challenges & Tradeoffs on the Road to AR
February 2018 VentureBeat op-ed piece (original here, text below)
Anyone involved in virtual reality over the past few years, whether as a developer, a user, or simply someone tracking the industry’s progress, will agree there’s a word they’ve heard a few times too many: Holodeck. The well-trod Star Trek concept has become a threadbare metaphor for a supposed end-point for VR technology.
While aspirational visions serve a purpose, they can also do us a disservice. The reality is that we are a very long way from that Holodeck vision, and that’s OK. VR is already serving many useful purposes with near-term solutions that don’t attempt to fool all our senses to the point of a complete suspension of disbelief. Most of the industry, it seems, has come to accept this, as have most VR users. We have, collectively, come to terms with the fact that great product solutions can exist in the near term that deliver some portion of the Holodeck promise while leaving other portions to the fictions of Star Trek and other sci-fi.
It is surprising, then, when looking at augmented reality [1], that so many believe in the promise of a “Holodeck of AR”: sleek and stylish glasses delivered via hardware and software magic that, rather than bringing us to any imaginable universe, instead bring any imaginable augmentation of the senses to our real world. Moreover, many believe this is deliverable in the near-term time horizon.
While solutions spanning the immersive technologies domain (AR, VR) will share dependence on common underlying technologies, augmented reality is in many ways a harder problem. AR can be thought of as a whole bouquet of thorny technical problems, each of which is its own very deep rabbit hole.
As with VR, AR involves an input-output loop that needs to execute sufficiently quickly to fool the conscious and subconscious to a degree where the results seem congruous with the surrounding world and the user’s sense of what seems natural. What’s more, in order to dovetail with the surrounding world, the solution may need to communicate with and draw from surrounding information sources. The sophistication of the processing the solution must perform varies by use case. And the solution needs to be embodied in something that a user can wear or carry in a manner suitable to their situation.
This is where the challenge becomes apparent. The sheer number of possible inputs and outputs one can imagine, the depth of each that might be required, the sophistication of the processing needed for a given task, and the desired attributes for the embodiment of that solution (price, form factor, etc.) make this a boundless problem.
Attributes of AR
For a sampling of the technical challenges facing AR, see the illustration below, which attempts to present the wide variety of attributes that an AR solution may embody. Titled the ‘Attributes of Augmented Reality’ [2], this, while almost certainly incomplete, is meant to illustrate the breadth of challenging problems to address. I’ve divided them into four main areas:
Sensing: Seeing, hearing, sampling, and otherwise deriving the state of the world surrounding the user.
Processing: Making sense of all of that data, what it means in the context of the computational tasks, simulations, and/or applications at hand, and making decisions about what to do next.
Augmenting: Taking the output of this processing and presenting it back to the user’s senses in a way that augments their sense of their environs.
Embodying: The attributes of the physical manifestation of the device or devices that deliver this solution.
This is an admittedly over-simplified division, and the sub-categories within each area are only a subset to which many working in the field could add. This, in a way, is the point: solutions that do ALL of these things, let alone do them well, cheaply, and unobtrusively, are a long way off.
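To make the sense-process-augment loop described above a little more concrete, here is a minimal, purely illustrative sketch of how those areas might fit into a single latency-bounded frame loop. The function names, the placeholder data, and the 90 Hz frame budget are assumptions for illustration only, not any particular device’s API or target.

```python
import time

# Illustrative assumption: ~11 ms per frame, a budget often discussed for
# comfortable head-worn displays. Real targets vary by device and use case.
FRAME_BUDGET_S = 1.0 / 90.0

def sense():
    """Placeholder for Sensing: cameras, IMU, audio, other signals from the world."""
    return {"imu": (0.0, 0.0, 0.0), "camera_frame": None}

def process(world_state, inputs):
    """Placeholder for Processing: pose estimation, scene understanding, app logic."""
    world_state["pose"] = inputs["imu"]
    return world_state

def augment(world_state):
    """Placeholder for Augmenting: compositing virtual content over the user's view."""
    pass

def run_loop(num_frames=10):
    world_state = {"pose": None}
    for _ in range(num_frames):
        start = time.monotonic()
        inputs = sense()                              # Sensing
        world_state = process(world_state, inputs)    # Processing
        augment(world_state)                          # Augmenting
        elapsed = time.monotonic() - start
        if elapsed > FRAME_BUDGET_S:
            print(f"Frame over budget: {elapsed * 1000:.1f} ms")
        else:
            time.sleep(FRAME_BUDGET_S - elapsed)      # Wait out the rest of the frame

if __name__ == "__main__":
    run_loop()
```

In a real product, each of those stubs hides one or more of the deep rabbit holes described above, and the whole loop has to fit inside the frame budget on power- and thermally-constrained hardware that someone is willing to wear or carry (the Embodying column).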
More challenging still is the number of problems in the space for which solutions do not yet even exist. I like to think of these problems as falling into three distinct domains:
Problems at the intersection of power, performance, and time. For those of us who work in Silicon Valley, these are the easiest to understand. For problems with known solutions, it is simply a matter of “how long before Moore’s Law allows us to do this in real time, within a certain power envelope?”
Problems requiring breakthroughs in science. This is a more challenging category, requiring a breakthrough past the limitations of existing technologies, or, more often, multiple breakthroughs. Examples from recent years include image-based inside-out 6DOF tracking and waveguide display technologies. Lightfield displays are an example that feels further out on the edge of today’s R&D. While predicting when these problems will be solved is much harder, people in the field have a certain faith that enough smart people in labs around the world are working on them to make progress.
Problems requiring breakthroughs in design, user experience, and social norms. I sometimes encounter folks who believe that if we tackle problems in the two categories above, this third category will be resolved in short order. Personally, I think this is the hardest category of the three. We can look at many technology transitions and see that there was a sort of “maximum rate of absorption” at which the design community could adapt to the new paradigm (e.g., the half-decade of three-finger swipes, swirly gestures, and other touchscreen UI experiments before the dust settled on what most smartphone apps use today).
Similarly, there’s an analogous societal component — it takes time for people to get used to intrusions of technology (real or perceived) on their lives. (Google Glass learned this lesson painfully.)
Specialization vs Jack-of-all-trades
Until a point in the far future where we can deliver all of the attributes of AR at extremely high quality, inexpensively, and seamlessly, we’re going to see interim solutions that are forced to make tradeoffs between them. This is a Good Thing. I hold a strong conviction that the path to success in this space is in doing fewer things extremely well, not many things in a compromised fashion.
It’s likely we’ll see AR solutions that tackle particular problems in point solution devices. We’ll see solutions that make compromises on some attributes in order to exceed expectations on others. We’ll see solutions that complement existing screens rather than replace them. And like with VR, we’ll see solutions that leverage the power of existing devices (PCs, game consoles, smartphones, etc.).
Fostering an Environment for Progress
If we take the view that different solutions will need to make different tradeoffs to be optimal for particular problems, customer segments, or form factors, and that we want many attempts at different flavors of AR solutions, then how do we encourage this?
The first step is to acknowledge that the “AR Holodeck” is not likely to arrive in the near term, and that interim, specialized solutions are not only OK but may be preferred. Second is to foster an environment that allows a multitude of solutions to materialize, through open platforms and open standards. Finally, the industry requires collaboration: as entrants solve a problem in one domain, they should share that solution with others to allow cross-pollination. Through these kinds of efforts, we may get our “holodeck of AR” eventually, but we’ll have been using AR for years already by the time that happens.
Kim Pallister manages the VR Center of Excellence at Intel. The opinions expressed in this article are his own and do not necessarily represent the views of Intel Corporation.
[1] I’m going to avoid getting into the AR/MR nomenclature debate. For the purposes of this article and the illustrative Attributes of AR poster, I’m covering the full spectrum of cases where a solution would supplement a user’s environment with spatial elements, regardless of how seamlessly or realistically the solution attempts to integrate them into the environment.
[2] To give credit where it’s due: I owe thanks to the folks at Ziba Design for helping lay out the design in a far more cohesive way than I originally had it on my whiteboard. Also, a huge thanks to John Manoogian III for his creation of the *brilliant* Cognitive Bias Codex, from which I took inspiration.