Magic Leap's core technology revealed

Magic Leap's core technology revealed

This article is reproduced from the WeChat public account: Mr. Sai. Original author: Gu Xianfeng is a tenured professor at Stony Brook University, State University of New York, a visiting professor at the Qiu Chengtong Mathematical Sciences Center of Tsinghua University, and the founder of computational conformal geometry.

On February 2, Magic Leap officially announced that it had received approximately $794 million in funding led by Alibaba Group. Together with the previous round of $542 million led by Google, the company has completed a total of $1.34 billion in investment since the end of 2014, with a total valuation of approximately $4.5 billion.

This Florida-based startup has been in business for decades, with a luxurious lineup and a large scale, but it is extremely secretive about its core technology. Occasionally, a few demo videos are leaked, which amaze the world and cause an uproar. So, what is the shocking technical secret of Magic Leap? Here, Lao Gu makes a bold guess and tries to explore the essence of the technology under its confusing appearance.

Figure 1. A giant whale leaps out from the center of the basketball court!

Figure 2. Turn your office into a gaming battlefield!

In the field of computer graphics, the evolution of 3D scene rendering demonstration technology can be roughly divided into the following historical stages: pinhole camera, binocular stereo vision, light field, digital holography. In short, the representative work of pinhole camera demonstration technology is the early animated film "Final Fantasy", the representative work of binocular stereo vision is the 3D version of "Avatar", the representative work of light field is Magic Leap, and the representative work of digital holography technology is the scenes in "Star Wars".

Figure 3. Final Fantasy: Ray Tracing rendering, pinhole camera display technology.

Figure 4. 3D version of "Avatar", binocular stereo vision.

Figure 5. Magic Leap, augmented reality, light field technology.

Figure 6. Star Wars, digital holography.

Magic Leap has realized and popularized light field display technology, which is a real revolution in 3D scene display technology, and it is natural that it has received unprecedented investment. So, what is light field? Is this technology completely new? What is the historical context of its development? Are there other companies that started with light field technology? We will explain one by one in the discussion below.

Pinhole Camera

The ideal model of a traditional optical camera is a pinhole camera. In computer graphics, traditional rendering methods are based on this camera model. As shown in Figure 7, a ray is emitted from the optical center of the camera through each pixel of the imaging screen. Ray tracing uses the physical laws of geometric optics to calculate the color of this ray, which is the color of the corresponding pixel. Figure 8 shows a rendered image calculated using ray tracing. Here, we need a conceptual transformation. Each pixel is not a point, but a ray. This is the key to understanding the light field! In other words, a photo is a cluster of rays passing through the optical center. Final Fantasy was rendered using ray tracing.

Figure 7. Pinhole camera model in ray tracing.

Figure 8. A scene rendered using ray tracing.

Traditional display methods, such as screens, LCD/LED, are based on the traditional concept that each pixel is regarded as a point, and the color of the same pixel remains unchanged when viewed from different angles. In other words, this display method loses information about the direction of the ray.

Binocular Stereo Vision

Humans have two eyes. When viewing an object, each eye forms an image. The brain calculates the depth information of each point based on the subtle differences in the images of the two eyes, thus obtaining a three-dimensional sense. To imitate the human eye, we can use a dual-lens camera to obtain binocular three-dimensional photos.

Figure 9. Binocular stereo camera.

Figure 10. Binocular stereoscopic photos taken during the Apollo moon landing program.

Essentially, binocular stereo vision photos are two rays starting from two optical centers. The 3D version of "Avatar" was made based on this principle. Compared with a monocular camera, binocular stereo vision has double the time complexity and space complexity.

Light Field - Magic Box Explanation


Figure 11. Magic box explanation of Light Field.

Let's assume that a rabbit is covered in a glass box, and then we observe the rabbit through the glass box. From any point on the surface of the box, we send a ray in any direction in three-dimensional space. The color of this ray is determined by the rabbit and the lighting conditions. To represent the glass box, represents a unit vector, and a ray is represented by , the set of all rays is denoted as Each ray corresponds to a color, which we represent as a point in three-dimensional space. . Therefore, the light field is a mapping from ray space to color space. In other words, the light field is a vector-valued function defined in ray space: .

Suppose we remove the rabbit from the glass box, but this glass box is a magic box, and the light field information is perfectly preserved. When we observe this magic box, all the rays passing through one eye synthesize an image on the retina. We can freely change the distance and viewing angle, and the image of the rabbit on the retina changes naturally accordingly, and we can't even notice the disappearance of the rabbit. Therefore, with the magic box, we no longer need a real rabbit. This magic box is the light field of the rabbit.

In the field of optics, light field is an old concept. It was introduced to the field of computer graphics by Microsoft and Stanford scholars in 1996, and it has been 20 years since its development in 2016. Although people in academia have been researching and deepening it, it has only been in recent years that it has really had an impact in the industry. Magic%20Leap should be regarded as a pinnacle of Light%20Field theory in real-world applications.

Light field rendering: We can use the light field of the rabbit to replace the rabbit, and render and generate photos from various angles, so that we do not need to build the geometric model, texture model and lighting model of the rabbit. For large scenes, complex lighting conditions, or complex geometric models (such as plush toys), etc., the light field is simpler than the digital model of the physical object, or the light field is more realistic than the rendering result obtained by ray tracing, or more efficient, so we use the light field to render. This is the so-called image-based rendering method (%20Image%20Based%20Rendering%20). Historically, Microsoft once released a version of a light field-based game, similar to the island treasure hunt, where all the scenes were collected from real nature and were very realistic, but in the end it did not cause any response and ended in vain.

Light field acquisition: Light field is a function defined in ray space, which is 4-dimensional. Traditional pinhole cameras can only collect 2D ray clusters, so light field acquisition is inherently difficult. Early light field acquisition methods were simple and crude, using large-scale camera arrays, such as the 2D camera array shown in Figure 12. This type of light field camera is bulky and expensive, and cannot be popularized.

Figure 12. Stanford’s light field camera: 16x8 multi-camera array.

As digital camera technology matures, pinhole cameras are becoming smaller and smaller and can be densely integrated, thus reducing the size of light field cameras. However, the size of the lens cannot be reduced, as shown in Figure 13.

Fig. 13. Stanford Light Field Camera: Camera Array.

The real breakthrough comes from bionics. Many insects have compound eyes, which acquire light field information.

Fig. 14. Insect compound eyes: light field cameras.

Humans imitated insects and created lenses similar to compound eyes, as shown in Figure 15, with dozens of small lenses integrated on a large lens. With the improvement of optical technology, people have created thousands of tiny lenses integrated on a plastic film. Based on this idea, Stanford doctoral student Wu Ren founded the light field camera company Lytro.

Figure 15. Prototype of artificial compound eye made by Adobe.

Traditional cameras need to focus first and then take pictures. The slogan of Lytro camera is "take pictures first, then focus". Because of the light field information obtained by Lytro camera, users can synthesize 4D photos of different angles and depths from the 4D light field.

Figure 16. Lytro camera.

Wedding photography as shown in Figure 17: In the same light field photo, we can focus on the groom who is close to the camera; we can also focus on the bride who is far away from the camera.

Figure 17. Lytro wedding photo: The same light field photo can be focused on different areas. The left frame focuses on the groom; the right frame focuses on the bride.

Light field display The traditional display method, screen, LCD/LED, only retains the geometric information and color information of the intersection point where the ray passes through the screen, but does not retain the directional information of the ray. The screen is diffusely reflective, and all rays emitted from a certain point on the screen are of the same color, while light field display requires different rays starting from the same point to have different colors, as shown in Figure 18. Light field display is the core technology of Magic Leap.

Figure 18. Display mode comparison: the left picture is a traditional screen, where all rays passing through a point are the same color; the right picture is a light field display, where different rays passing through a point are different colors.

USC's light field display The University of Southern California proposed and produced a light field display device, as shown in Figures 19 and 20. There is a glass cabinet with four transparent sides. In the middle of the cabinet is a mirror with an angle of 45 degrees to the horizontal plane. A high-speed projector is installed on the top of the cabinet. The projector projects vertically downward, and the light is reflected by the mirror and then emitted horizontally. At the same time, the mirror rotates at high speed. A ghostly transparent human head is suspended in the air. When we walk around the cabinet, we can see all sides of the head, and the head winks at you.

Figure 19. USC Light field display, a floating human head.

Figure 20. USC Light field display for teleconferencing system.


Figure 21. USC Light Field display patent drawing.

FIG21 shows the principle of this light field display device. The 45-degree tilted mirror (114) is driven to rotate by the motor (115), and the graphics processor (130) generates an image and transmits it to the high-speed projector (120). The projector projects the image onto the mirror, which is then reflected and projected horizontally to all directions. In this way, after strict synchronization control, we can display a three-dimensional light field. This device is bulky and expensive, and the high-speed rotating mirror reduces the stability of the system. Any mechanical vibration will affect the light field display effect.

Magic Leap Light Field Display - Flashlight Explanation Magic Leap's core technology is a special light field display device: Fiber Optic Projector. The laser propagates in the optical fiber and is emitted at the end of the fiber, and the output direction is tangent to the fiber. By changing the shape of the fiber in three-dimensional space, especially changing the tangent direction at the fiber end, we can control the direction of the laser emission. This is like holding a flashlight and changing the position and direction of the flashlight to change the direction of the output light column. If we shake our wrist quickly, the light column emitted by the flashlight draws a cone in the air, and this cone hits a wall to form a circumference. By quickly changing the amplitude of the wrist shake, we can control the radius of the circumference, thereby obtaining a series of concentric circles, which cover a disk. If the color of the flashlight's light column changes, we draw a colored disk on the wall. In this way, by quickly shaking a flashlight, we get an image, or cover a cluster of rays. Suppose there are many people standing in different spatial positions, and each of them shakes a flashlight, then we get a light field. This is the principle of Magic Leap's light field display device: the fiber optic projector.


Figure 22. Magic Leap's flashlight.

Figure 22 shows Magic Leap's flashlight. The actuator (206) is equivalent to a person's wrist, and the optical fiber (208) is equivalent to the flashlight. The actuator causes the fiber tip to vibrate periodically, and the fiber tip spirally draws a series of concentric circles. The laser is output through the lens system, drawing a cluster of rays in the air. It is projected onto a plane to illuminate a disk. By synchronously changing the color and intensity, a fiber uses time-sharing technology to obtain an image, as shown in Figure 23.


Figure 23. One fiber produces one image using time-sharing technology.

In Magic Leap's fiber light projector, there are many optical fibers assembled into a two-dimensional array. Each fiber is equivalent to a pinhole camera, and the two-dimensional camera array generates a light field.

Advantages of light field display Compared with binocular stereo vision, light field display has many advantages. There are two ways for humans to obtain three-dimensional depth information, "shape from stereo" and "shape from focus". We use two eyes to look at the same object, and the same point in three-dimensional space is projected onto different pixels on the left and right retinas. Our human brain can reversely calculate the corresponding spatial rays through the pixels on the retina, thereby obtaining the intersection of the two rays and obtaining depth information. This process is "shape from stereo". When each of our eyes looks at an object, the brain automatically adjusts the curvature of the eye's lens so that the object is clearly imaged on the retina. Adjusting the muscle tension of the lens allows the brain to calculate the depth information of the object, which is the so-called "shape from focus". When watching the 3D version of "Avatar", we only used "shape from stereo", and the focal length of the eyes has been fixed because the distance from the eyes to the screen does not change, so there is no "shape from focus" process. However, after a long evolution of humans, these two processes are naturally closely linked. Artificially separating them will cause dizziness. On the contrary, if we use light field display technology, we need both “shape from stereo” and “shape from focus” at the same time, so you won’t feel dizzy when watching. Light field display technology is more natural and healthy.

Challenges of light field display As the beginning of a revolution, Magic Leap's technology faces many challenges. The most direct one is that traditional display technology only needs to calculate a two-dimensional slice in a four-dimensional light field, while light field display needs to calculate the entire four-dimensional light field, and its computational complexity increases by several orders of magnitude, which is one of the technical bottlenecks. At the same time, precise control of mechanical components so that each fiber vibrates stably and naturally, and the vibration pattern must be synchronized with data transmission, and this vibration is not affected by external noise, also requires incredible technology.

It has been 20 years since the concept of digital holographic light field was proposed to the investment frenzy of Magic Leap, and the development history of digital holographic technology is even longer. Light field is essentially geometric optics, while digital holography is wave optics. At present, digital holographic technology is becoming more and more mature. With the invention of blue laser, color digital holographic technology has become possible. The current bottleneck of development is that the amount of calculation is huge, far exceeding the calculation of light field, and the second is that a special kind of crystal is required in digital holographic display, and the refractive index of each pixel can be controlled by voltage. At present, this kind of optical device is still expensive and small in size. We believe that with the widespread acceptance of light field technology, digital holographic technology will also develop rapidly.

The inspiration of light field technology The historical development of light field technology shows us that disruptive technological revolutions often originate from basic science and non-commercial academic circles. It often takes decades for a technology to mature in academia and become a force to be reckoned with in the business world. Magic Leap's technological breakthrough came from the use of endoscopy technology, which shows the importance of cross-border scientific research.

I look forward to the day when all TV and movies are shot with light field cameras, and the audience can dynamically choose the viewing angle at will. Maybe this day will take another twenty years, or maybe only three to five years. I believe that in the near future, all photos on Taobao will be replaced by light field photos, and the Magic Leap helmet will become the standard for every online shopper. ( This article was first published in "Lao Gu Talks Geometry" and is reprinted with the author's permission.)

As a winner of Toutiao's Qingyun Plan and Baijiahao's Bai+ Plan, the 2019 Baidu Digital Author of the Year, the Baijiahao's Most Popular Author in the Technology Field, the 2019 Sogou Technology and Culture Author, and the 2021 Baijiahao Quarterly Influential Creator, he has won many awards, including the 2013 Sohu Best Industry Media Person, the 2015 China New Media Entrepreneurship Competition Beijing Third Place, the 2015 Guangmang Experience Award, the 2015 China New Media Entrepreneurship Competition Finals Third Place, and the 2018 Baidu Dynamic Annual Powerful Celebrity.

<<:  The gap between Chinese and American TV series from the second season of Fargo

>>:  Is it time for the unformed third-party content platforms upstream and downstream to enter the VR stage?

Recommend

Why are Xiaohongshu notes not included?

The inclusion of Xiaohongshu notes has always bee...

Consumer Reports compares the fuel efficiency of nearly 50 SUVs

SUVs are widely considered to be versatile vehicl...

The Red and the Black in the Flashing World

In the virtual world of the Internet, people'...

Design like a psychologist! 5 practical tips to control user behavior

Have you tried various design strategies and meth...

iOS - Solution to NSTimer circular reference

Occurrence scenario In Controller B there is a NS...

Getting Started with WatchKit: Creating a Simple Guessing Game

Editor's note: As you know, Apple has include...

What are the Android emulator detection methods?

The general method of detecting the Android emula...