Nvidia has demonstrated an amazing new artificial intelligence algorithm that can automatically transform a handful of 2D photos into a realistic 3D scene.
The algorithm, which is based on a deep learning neural network, was able to create a 3D scene from just 12 shots, rendering it in a few seconds. The results are still far from perfect, but they are stunning, and they show the promise of future applications in areas such as video editing and the rendering of faces, people or landscapes.
Nvidia has released a video demonstrating the algorithm in action - check it out for yourself.
Instant NeRF, Nvidia's 3D "magic"
The tool developed by Nvidia is known as Instant NeRF. The name comes from “neural radiance fields,” a technique introduced in 2020 by researchers from UC Berkeley, Google Research and UC San Diego.
If you really want to know more, take a look here. If, on the other hand, my own understanding is enough for you: in essence, the technique combines the color and light intensity data from a set of 2D images to produce a 3D scene. Besides the photographs themselves, the system also needs to know the camera position from which each one was taken.
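For readers who like to see the idea in code, here is a minimal sketch of the compositing step that NeRF-style methods use to turn color and density samples along one camera ray into a single pixel. It is an illustration under stated assumptions, not Nvidia's implementation: in a real system the densities and colors come from a trained neural network, and the camera position determines where each ray starts and points. Here the function name composite_ray and all sample values are hypothetical and hard-coded.

```python
import math


def composite_ray(densities, colors, step):
    """Blend samples along one camera ray into a single pixel color.

    densities: how opaque the scene is at each sample (non-negative floats)
    colors:    (r, g, b) tuple at each sample, values in [0, 1]
    step:      distance between neighbouring samples along the ray
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked by earlier samples
    for sigma, rgb in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this sample
        weight = transmittance * alpha         # contribution to the pixel
        for k in range(3):
            pixel[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha           # light left for samples behind
    return tuple(pixel)


# Hypothetical ray: empty space first, then a dense red surface,
# then something blue hidden behind it.
densities = [0.0, 50.0, 50.0]
colors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
print(composite_ray(densities, colors, step=0.5))
```

The printed pixel comes out almost pure red: the dense red sample absorbs nearly all the light, so the blue sample behind it barely contributes. Repeating this for every pixel of every viewpoint is what makes rendering expensive, and it is the part that faster methods try to speed up.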
Researchers have been working on improving this type of 2D-to-3D model for a couple of years, adding more detail to the renderings while reducing turnaround times. Nvidia claims that its new rendering method, Instant NeRF, is perhaps the fastest ever created: a procedure that used to take a few minutes is now completed practically instantly.
Possible fields of application
As the technique becomes faster and easier to implement, it could be used by all types of businesses, Nvidia says in a blog post describing the work.
Instant NeRF can be used to generate avatars or environments for virtual worlds, capture video conference participants and their surroundings in 3D, or recreate settings for 3D digital maps, according to Isha Salian of Nvidia.
The technology could be used to teach robots and self-driving cars how to identify the size and shape of real-world objects by taking 2D photographs or videos of them. It could also be very useful in architecture and entertainment, and it could help startups (including Italian ones, like this) create digital representations of real environments more quickly, which creators can then modify and develop in a flash.