After several months of reported talks between the companies, Apple confirmed its acquisition of PrimeSense last weekend for a reported $360M. PrimeSense is an Israel-based company known for its structured light technology (which it calls "light coding"), which Microsoft used in the first Kinect. Interestingly, Microsoft switched to ToF (time-of-flight) technology for the new Kinect released recently.
Microsoft acquired ToF technology developers 3DV and Canesta in 2009 and 2010, respectively, gaining ToF know-how and IP. Thus, when the Kinect was first demonstrated at E3 2009, it was surprising that Microsoft chose PrimeSense's structured light approach instead of ToF.
In 2013, gesture sensing technologies have been an active area for acquisitions, with Intel acquiring Omek and Google acquiring Flutter; both deals were reportedly worth $40M. The much higher valuation of PrimeSense may indicate that gesture sensing has become as critical a strategy for future user interfaces as touch screens are for mobile devices.
Our Gesture Sensing Control for Smart Devices Report analyzes the pros and cons of these technologies and their applications, ranging from 2D to 3D depth sensing. Controller-free 3D gesture sensing is already standard for game consoles such as Microsoft's Kinect for Xbox and the new PlayStation Camera for PlayStation 4. Clearly, a future Apple TV is the logical platform for Apple to adopt PrimeSense's structured light technology.
Structured light technology requires a few key elements. The first is an active light source: an IR (infrared) projector. In Microsoft's first-generation Kinect, the IR projector uses a laser diode and a diffractive optical element (DOE) from JDSU. Second, an IR CMOS image sensor is needed to capture images of how the projected pattern is deformed by objects in the scene. Third, a powerful backend SoC, such as PrimeSense's PS1080 (ARM Cortex-A8 or A9 level), is necessary to convert that deformation information into meaningful 3D range images.
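The core computation behind that last step can be sketched with the standard triangulation relation used in structured light and stereo systems: depth is proportional to the projector-camera baseline and focal length, and inversely proportional to the observed shift (disparity) of the pattern. The sketch below uses hypothetical calibration values, not PrimeSense's actual parameters.

```python
def disparity_to_depth(disparity_px: float, focal_length_px: float, baseline_m: float) -> float:
    """Estimate depth (meters) from the observed pixel shift of a
    projected IR pattern, via the triangulation relation z = f * b / d.

    disparity_px    -- pattern shift between expected and observed position
    focal_length_px -- camera focal length expressed in pixels
    baseline_m      -- distance between IR projector and IR camera
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example with assumed (illustrative) parameters: a 580 px focal length
# and a 7.5 cm projector-to-camera baseline. A 20 px shift then maps to
# roughly 2.2 m of depth.
depth_m = disparity_to_depth(disparity_px=20.0, focal_length_px=580.0, baseline_m=0.075)
```

In a real system the SoC performs this matching and conversion for every pixel at video frame rates, which is why a dedicated chip like the PS1080 is needed rather than doing it on the host CPU.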
Given the need for these key parts, it is quite difficult to fit a structured light system into a mobile device such as an iPhone, iPad, or even a MacBook Pro. The current BOM cost could be $40-50, which would also put it out of range for these applications. Finally, the SoC power consumption would likely be too high. Consequently, the most logical platform is Apple TV. Samsung has offered gesture sensing via a built-in webcam on its smart TVs since 2012, but the performance and utility are not yet acceptable, and the approach relies on 2D image processing rather than 3D depth sensing. If Apple uses structured light in a future TV product, it would be the first to enable 3D gesture sensing in that category.
Apple would likely encourage developers to design apps suited to gesture sensing on Apple TV, enabling it to expand the iPad experience to the big screen in the living room and move beyond video. However, this would be more likely in a set-top box Apple TV than in a full television. Attractive flat-panel TV designs do not have enough space to accommodate the structured light elements (light source, CMOS image sensor, and SoC PCB assembly); in addition, the use of active light requires the source and sensor to sit at the front of the set. (There are other indications that Apple has pulled back on producing a traditional TV.)
This makes a future Apple TV more likely to follow the Mac mini approach: a set-top box (at a higher price than the existing $100) and, possibly, a bundle with a high-performance display. Either way, this suggests a "TV" experience beyond broadcasting, and even beyond video, just as making a phone call is only one of the uses of a smartphone. The next stage of the smart TV may combine video on demand with user interfaces far beyond the remote control, along with unique applications. Apple's acquisition of PrimeSense is another indication that gesture sensing has a role to play in both of these new directions.