Image: Christine Daniloff/MIT
Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.
In other experiments, they extracted useful audio signals from videos of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant. The researchers will present their findings in a paper at this year’s Siggraph, the premier computer graphics conference.
“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”
Joining Davis on the Siggraph paper are Frédo Durand and Bill Freeman, both MIT professors of computer science and engineering; Neal Wadhwa, a graduate student in Freeman’s group; Michael Rubinstein of Microsoft Research, who did his PhD with Freeman; and Gautham Mysore of Adobe Research.
Reconstructing audio from video requires that the frequency of the video samples — the number of frames of video captured per second — be higher than the frequency of the audio signal. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That’s much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.
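The frame-rate requirement above can be illustrated with a small sketch. It is not the researchers' algorithm, only the standard sampling rule of thumb: by the sampling theorem, a vibration is generally recoverable without ambiguity only if the frame rate exceeds twice its frequency, and anything faster "folds" down to a lower apparent frequency. The function names and the 440 Hz test tone are hypothetical choices made for illustration.

```python
def nyquist_limit_hz(frame_rate_fps: float) -> float:
    """Highest frequency a given frame rate can capture without aliasing."""
    return frame_rate_fps / 2.0


def apparent_frequency_hz(signal_hz: float, frame_rate_fps: float) -> float:
    """Frequency a pure tone appears to have after sampling at frame_rate_fps.

    Frequencies above the Nyquist limit fold back into [0, frame_rate / 2].
    """
    folded = signal_hz % frame_rate_fps
    return min(folded, frame_rate_fps - folded)


if __name__ == "__main__":
    tone_hz = 440.0  # illustrative test tone, not a value from the article
    for fps in (60.0, 2000.0, 6000.0):  # frame rates mentioned in the article
        print(
            f"{fps:>6.0f} fps: limit {nyquist_limit_hz(fps):>6.0f} Hz, "
            f"{tone_hz:.0f} Hz tone appears at "
            f"{apparent_frequency_hz(tone_hz, fps):.0f} Hz"
        )
```

Under these assumptions, a 440 Hz tone is captured faithfully at 2,000 or 6,000 frames per second, but at 60 frames per second it aliases to 20 Hz, which is why plain frame-by-frame sampling with an ordinary camera is not enough on its own.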
Commodity hardware
In other experiments, however, they used an ordinary digital camera. Because of a quirk in the design of most cameras’ sensors, the researchers were able to infer information about high-frequency vibrations even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn’t as faithful as that with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers’ voices, their identities.