A new wearable device developed by scientists at Cornell University can capture a person's facial expressions using sonar and reproduce them on a digital avatar. Avoiding cameras could mitigate privacy concerns.
The device, called EarIO, is remarkably simple: an earpiece with a speaker and a microphone on each side, which can be attached to any ordinary pair of headphones. The speakers emit sound pulses outside the range of human hearing, and the microphones pick up their echoes, exactly as sonar does.
The echo profiles change with facial expressions: appropriately trained algorithms recognize the variations in the sonar signal and translate them into images.
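To make the idea concrete, here is a minimal, purely illustrative sketch of the sensing principle described above: emit a chirp above the audible range, then cross-correlate the microphone recording with the emitted pulse to obtain an "echo profile." This is an assumption-laden toy in Python with NumPy, not the Cornell team's actual signal pipeline; all parameter values are hypothetical.

```python
import numpy as np

FS = 44_100     # sample rate (Hz); a typical earphone value, assumed here
CHIRP_MS = 10   # pulse duration in milliseconds (illustrative)

def chirp(f0=18_000.0, f1=22_000.0, dur_ms=CHIRP_MS, fs=FS):
    """Linear frequency sweep above the range of human hearing."""
    t = np.arange(int(fs * dur_ms / 1000)) / fs
    k = (f1 - f0) / (dur_ms / 1000)  # sweep rate (Hz per second)
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t**2))

def echo_profile(recording, pulse):
    """Cross-correlate the mic recording with the emitted pulse.
    Peaks correspond to reflections arriving at different delays;
    their pattern shifts as facial muscles move."""
    return np.abs(np.correlate(recording, pulse, mode="valid"))

# Simulate a recording: the direct pulse plus a delayed, weaker echo
# (standing in for a reflection off the skin).
pulse = chirp()
recording = np.zeros(2048)
recording[:len(pulse)] += pulse                  # direct path
recording[300:300 + len(pulse)] += 0.3 * pulse   # simulated reflection

profile = echo_profile(recording, pulse)
print("strongest peak at sample offset:", int(np.argmax(profile)))
```

In a real system, successive echo profiles would be fed to the trained model; here, the point is only that a delayed reflection shows up as a distinct peak in the correlation, so a change in the face changes the profile.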
EarIO, the sonar that "sees" through sound
“Thanks to the power of artificial intelligence, our algorithm builds intricate connections between muscle movement and facial expressions that humans cannot perceive,” says Ke Li, one of the study's co-authors. “It can be used to extract very complex information: the movement of the entire front of the face.” The research was published in the journal Proceedings of the Association for Computing Machinery on Interactive, Mobile, Wearable and Ubiquitous Technologies; I've linked it here.
The team tested EarIO on 16 participants, running the algorithm on an ordinary smartphone. The device reconstructed facial expressions as well as a conventional camera could, and background noise such as wind, conversation, or street sounds did not interfere in the slightest with its ability to read faces.
Technology from 007
The researchers point out that sonar has several advantages over a camera. Acoustic data requires far less energy and processing power, allowing for smaller, lighter devices. Cameras can also capture a great deal of extra personal information that users may not want to share, so sonar may be the more privacy-friendly option.
Of course, letting my imagination run, I can picture technology like this tucked silently into an ordinary earphone, transmitting lip movements and expressions remotely for surveillance purposes. For now, though, I see more practical uses for it.
Which ones? First of all, gaming: a practical way to replicate real facial expressions on a digital avatar for games, virtual reality, or the metaverse. The team is now working to filter out other sources of interference, such as when the user turns their head, and to simplify the training of the AI algorithm.
We'll see. Or rather, we'll hear. You get the idea.