The tech to build the holodeck

Source: The Verge

When I visited my elderly mom in Germany recently, I realized it could be one of the last times I see her in the cozy little house she has called home for more than two decades. So I did what anyone would do: I busted out my phone and took lots of photos of the place to preserve as many memories as possible. I captured the warm fireplace, the shelves full of familiar books, and the rickety old garden bench up front that everyone signed during a special birthday celebration many years ago.

Then, I tried something else. I opened up Scaniverse, a 3D scanner app from Pokémon Go maker Niantic, and captured some of those things as 3D objects, crouching and tiptoeing my way around them as I slowly moved my phone to record every angle and inch. The results were a bit imperfect around the edges, but they still felt profound. When I opened the scans up later, both on my phone and with a VR headset, I was able to look at that weathered garden bench from all angles, as if I were standing right in front of it. The experience touched me emotionally in ways I wasn’t prepared for.

That experience was possible thanks to Gaussian splatting, a novel method of 3D capture that was invented less than two years ago and is already taking the tech industry by storm. Both Niantic and Google are using it to build out their respective mapping products; Snap has added support for splats — which is what objects captured with Gaussian splatting are colloquially called — to its Lens Studio developer platform, and Meta wants to use Gaussian splatting to create a metaverse that looks just like the real world.

Tech companies are enamored with Gaussian splatting for its ability to photorealistically capture, and then digitally recreate, three-dimensional objects. It could soon allow anyone to scan entire rooms and change how creatives in Hollywood and beyond record 3D video. When combined with generative AI, it has the potential not only to preserve existing spaces but also to transport us to entirely new 3D worlds.

“It’s a huge game changer,” said AR / VR expert and investor Tipatat Chennavasin. As a cofounder and general partner of the Venture Reality Fund, Chennavasin has a financial interest in this technology’s success. As a geek and former 3D artist, he has fallen in love with it, likening it to the Star Trek holodeck, which allowed crew members to enter holographic 3D simulations of real and imaginary spaces. “We’re starting to get to a photoreal holodeck.”

Building a 3D map of the world, one splat at a time

Capturing objects in 3D, even on your phone, is not new. However, most prior efforts relied on polygons, the kind of triangular, cyberpunk-looking meshes you’ve seen if you’ve ever used a mobile AR app.

Polygon mesh-based 3D capture and reconstruction is good enough for basic objects with flat surfaces, but it can struggle with detailed textures and complex lighting. Objects captured this way often look plasticky and unreal, and 3D-captured humans always appear to have used way too much gel rather than having individual strands of hair. “It was promising at the time, but always had huge limitations,” Chennavasin said.

All of that changed in the summer of 2023, when a group of European scientists published a paper on something they called “3D Gaussian splatting.” Their approach to the problem was to ditch the meshes and instead capture 3D objects as a collection of fuzzy, translucent blobs, also known as Gaussians.

Each of these blobs is captured with exact information on its color, location, scale, rotation, and level of transparency. Combine millions of them, and you get a much more detailed picture of a 3D object, one that also describes how it looks from any given angle, thanks to all of this additional data. Using machine learning, the researchers were able to capture objects with a lot more detail, in higher fidelity, and render them in real time without the need for heavy graphics-rendering rigs.
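To make the idea of a "blob" more concrete, here is a heavily simplified Python sketch of how one splat might be stored and how several of them could be blended. The `Splat` class, the `falloff` and `composite` functions, and all field names are illustrative assumptions, not the format used by Scaniverse or the original paper. Real renderers project each Gaussian onto the screen and take its rotation into account; this sketch stores the rotation but ignores it, and simply samples the blobs at a single point in space.

```python
import math
from dataclasses import dataclass


@dataclass
class Splat:
    """One fuzzy, translucent blob (hypothetical layout for illustration)."""
    position: tuple[float, float, float]         # center of the blob
    scale: tuple[float, float, float]            # size along each axis
    rotation: tuple[float, float, float, float]  # unit quaternion; unused in this sketch
    color: tuple[float, float, float]            # RGB, each in 0..1
    opacity: float                               # 0 = invisible, 1 = fully opaque


def falloff(splat: Splat, point: tuple[float, float, float]) -> float:
    """Gaussian falloff: 1.0 at the center, fading smoothly with distance."""
    d2 = sum(((p - c) / s) ** 2
             for p, c, s in zip(point, splat.position, splat.scale))
    return math.exp(-0.5 * d2)


def composite(splats_front_to_back: list[Splat],
              point: tuple[float, float, float]):
    """Blend splats, nearest first, using standard alpha compositing."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # how much light still passes through
    for s in splats_front_to_back:
        alpha = s.opacity * falloff(s, point)
        for i in range(3):
            color[i] += transmittance * alpha * s.color[i]
        transmittance *= 1.0 - alpha
    return tuple(color), transmittance


identity = (1.0, 0.0, 0.0, 0.0)
red = Splat((0, 0, 0), (1, 1, 1), identity, (1.0, 0.0, 0.0), 0.5)
blue = Splat((0, 0, 0), (1, 1, 1), identity, (0.0, 0.0, 1.0), 1.0)
blended, remaining = composite([red, blue], (0.0, 0.0, 0.0))
print(blended, remaining)
```

In this toy example, a half-transparent red blob sits in front of an opaque blue one, so the sampled color is an even mix of the two. Layering millions of such blobs is, roughly speaking, what lets fine details like hair and fabric read as soft and natural rather than plasticky.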

Experts in the field were immediately blown away by the results. “We finally have the chance to have true 3D that’s photo-real,” Chennavasin said. “It’s the JPEG moment for spatial computing.”

Niantic SVP of engineering Brian McClendon believes that Gaussian splats are the most profound advancement in the field of 3D graphics in more than 30 years. “We see it as a fundamental change,” he said. 

According to McClendon, Gaussian splatting is going to democratize 3D capture — and Niantic wants to be at the forefront of this change. After acquiring the Scaniverse app in 2021, Niantic added Gaussian splatting as a capture technology last year. In August, it launched a new version of Scaniverse that puts splatting front and center. In October, the company open sourced its own file format for splats. And in December, Scaniverse expanded to VR, enabling users to look at Gaussian splats in Meta’s Quest headsets.

Niantic has its own reasons for pushing splatting. Scaniverse started out as an app to capture personal memorabilia and other individual objects, but Niantic is now encouraging people to also scan statues, fountains, and other public points of interest. The company sees these scans as key components of the 3D map of the world it is building — the same map that powers Pokémon Go, Peridot, and future geospatial AR games and experiences. “We are very focused on the map, and scanning and reconstructing the outdoors,” McClendon said.

“We already have hundreds of thousands of these [types of scans] in Scaniverse right now,” McClendon said. “Hopefully, we’ll get to a million soon.”

Splats are changing 3D video capture

Gaussian splats aren’t just for capturing static content. Computer vision startup Gracia AI has been using the technology to record volumetric 3D videos, which can be viewed on Meta Quest headsets. One of those clips shows a chef preparing a meal, with viewers being able to look at the action from all angles in VR and even zoom in to observe his knife slicing through a glistening piece of raw salmon. 

Gracia recorded this video in a professional 3D capture studio, using an array of 40 cameras pointed at the chef from all angles. That’s how professionals have been recording holographic content for AR and VR experiences for years — but once again, the transition from polygons to Gaussian splats makes all the difference.

Previously, 3D video capture presented a series of visual challenges that led to strict dress codes for captured individuals: no busy patterns, nothing translucent, nothing loose and dangling that could result in weird artifacts. When Microsoft captured David Attenborough this way several years ago, it even had to glue his collar to his shirt and use obscene amounts of hairspray to literally avoid any loose ends that could mess up the capture process.

With Gaussian splats, all of those limitations are gone. “There are no restrictions with clothing, there are no restrictions with hair,” said Gracia cofounder and CEO Georgii Vysotskii, who counts Chennavasin’s Venture Reality Fund among his company’s investors. While previous-generation volumetric video capture required blinding amounts of light to eliminate any shadows, Gracia has been able to record scenes in almost complete darkness. “You can leave all the shadows, and use artistic lighting,” Vysotskii said. “It’s amazing how much creative flexibility you get with Gaussian splats.”

That’s not to say there aren’t still challenges. At the moment, Gaussian splatting clips still require 9GB of data per minute of video — too much for streaming or really anything beyond a short tech demo. Vysotskii said that the company is now working on reducing it to 2–3GB per minute, and 180-degree volumetric VR videos could require as little as 1GB of data per minute. He envisions these types of clips eventually replacing the recordings of instructors in VR workout apps like Supernatural or professional educational content because they allow users to look at instructions from all angles.
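To put those figures in streaming terms, the short Python sketch below converts gigabytes per minute into the sustained connection speed a viewer would need. It assumes 1GB equals one billion bytes and ignores any buffering or protocol overhead; the function name is made up for this illustration.

```python
def gb_per_minute_to_mbit_per_second(gb_per_minute: float) -> float:
    """Convert GB per minute of video into a sustained rate in Mbit/s.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits).
    """
    bits_per_minute = gb_per_minute * 1e9 * 8
    return bits_per_minute / 60 / 1e6


# Figures quoted in the article, in GB per minute of video.
for label, gb in [("today", 9), ("target, high end", 3),
                  ("target, low end", 2), ("180-degree video", 1)]:
    rate = gb_per_minute_to_mbit_per_second(gb)
    print(f"{label}: {rate:.0f} Mbit/s")
```

At 9GB per minute, that works out to about 1,200 Mbit/s, which is more than many home connections can sustain. Even the 1GB-per-minute case would need roughly 133 Mbit/s.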

Meta’s ambitious plans for Gaussian splats

One of the most ambitious demos of Gaussian splats to date has been built by Meta. Hyperscape, which the company unveiled at its Meta Connect conference this fall, is an app for Meta’s Quest headsets that lets users explore photorealistic 3D renderings. The app launched with six scanned spaces, including five artist studios and a conference room on Meta’s campus that once served as Mark Zuckerberg’s office.

Hyperscape allows you to freely move around in these spaces, which is a fascinating experience with this kind of visual fidelity. You can browse the many oddities in the San Francisco studio of mixed media artist Dianne Hoffman, which includes countless dolls and a box labeled “snake skin and shells.” You can marvel at the extensive Porsche collection of visual artist Daniel Arsham and even look at the fern and trees outside the window of Zuck’s former office. The renderings feel so real that Meta felt compelled to include a warning not to lean on any of the depicted furniture.

At the moment, Hyperscape is not much more than a bespoke tech demo. However, Meta has big plans for Gaussian splats, as Meta Horizon OS and Quest VP Mark Rabkin told me at Meta Connect this fall. “Gaussian splats are already running for us on an engine that’s pretty much the Horizon engine,” Rabkin said, referring to Meta’s social VR platform. “So the path, technologically, to get it to run in a world is pretty short.”

Meta envisions splats as yet another tool for VR creators to build immersive worlds and experiences for Horizon Worlds. The company even has plans to eventually allow anyone to scan their own home and then upload a digital copy of it to the metaverse. “Definitely,” Rabkin said. “That’s what we’re working toward.”

How long that work will take is unclear, and whether Horizon Worlds will survive in its current form until then is another question altogether. Meta declined to participate in follow-up interviews for this story, but Niantic’s McClendon cautioned not to underestimate the complexity of building a scanning tool like Hyperscape.

“They basically have produced a perfect view,” McClendon said. Meta likely combined multiple scans for each room and probably also did a good amount of manual editing and cleanup, he suggested. And since the resulting scans are too big to process in real time on a device, Meta is rendering them in the cloud and streaming them directly to headsets.

“That doesn’t scale, but it looks really good,” McClendon said. “Do they have a path to scaling that? I don’t know.”

A clear shot to the holodeck

The development of Gaussian splatting tech is advancing at a rapid pace. McClendon told me that the speed at which new scientific papers on the subject are coming out mirrors that of generative AI research. “Papers are getting published so fast right now,” he said. “The excitement is real.” And the tech they’re developing is being implemented quickly, Chennavasin said. “Or turned into startups.”

One of the areas ripe for a breakthrough is the combination of splats and AI. Generative AI could improve the capture and rendering of Gaussian splats, potentially allowing a company like Gracia AI to capture videos with far fewer cameras. At the same time, many more people capturing 3D objects and scenes will also dramatically increase the amount of high-quality training data for generative 3D video models.

All this points toward a future in which everyday people will be able to generate photorealistic 3D spaces with AI prompts, Gaussian splat captures, or a mixture of both, and then enter those spaces with VR headsets or AR glasses.

“The killer app of XR is a multiplayer holodeck,” said Chennavasin. “Generative AI and Gaussian splats is how we create it at a visual fidelity that’s almost indistinguishable from reality. It’s not happening overnight. But it is a clear shot now.”

Such a future within reach raises the question: if you had a holodeck, what would you visit first? Photorealistic renditions of far-away places that you haven’t had a chance to travel to yet? Famous recording studios, museums, or libraries? Or, rather, fantastic worlds like medieval castles, dungeons, or Marvel movie sets?

For me, it may just be my mom’s cozy little house and that rickety garden bench.


