Google DeepMind unveils V2A, a new AI model that can generate soundtrack and dialogue for videos
Google DeepMind, the company’s AI research lab, has unveiled V2A, a new model that can generate audio for videos.
Video generation models like Sora, Dream Machine, Veo and Kling are advancing at a rapid pace, allowing users to generate videos from text prompts. But most of these systems are limited to silent videos. Google DeepMind seems to be aware of the problem and is now working on a new AI model that can generate soundtracks and dialogue for videos.
In a blog post, the tech giant’s AI research lab unveiled V2A (Video-to-audio), a new work-in-progress AI model that “combines video pixels with natural language text prompts to generate rich soundscapes for the on-screen action.”
Compatible with Veo, a text-to-video model the company introduced at the recently concluded Google I/O 2024, V2A can be used to add dramatic music, realistic sound effects and dialogue that matches the tone of a video. Google says the new model also works with “traditional footage” like silent films and archival material.
The new V2A model can generate an “unlimited number of soundtracks” for any video and supports an optional ‘positive prompt’ and ‘negative prompt’, which can be used to steer the output toward or away from certain sounds. It also watermarks the generated audio with SynthID technology.
DeepMind’s V2A technology takes the description of a sound as input and uses a diffusion model trained on a combination of sounds, dialogue transcripts and videos. Because the model was trained on a limited set of videos, its output can sometimes be distorted. Google also says it won’t release V2A to the public anytime soon, in order to prevent misuse.
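Google has not published V2A’s internals or any API, but diffusion models commonly steer generation with positive and negative prompts via classifier-free guidance: the model predicts noise once conditioned on the positive prompt and once on the negative prompt, then extrapolates away from the negative prediction. The sketch below illustrates only that combination step, with NumPy arrays standing in for the model’s noise predictions; the function name and guidance scale are illustrative assumptions, not anything DeepMind has described.

```python
import numpy as np

def guided_noise(eps_positive: np.ndarray,
                 eps_negative: np.ndarray,
                 guidance_scale: float = 7.5) -> np.ndarray:
    """Classifier-free-guidance-style blend of two noise predictions.

    eps_positive: noise predicted when conditioning on the positive prompt
    eps_negative: noise predicted when conditioning on the negative prompt
    guidance_scale: how strongly to push toward the positive prompt
    (a common convention; V2A's actual mechanism is not public)
    """
    # Extrapolate from the negative-prompt prediction toward the
    # positive-prompt prediction; scale > 1 amplifies the difference.
    return eps_negative + guidance_scale * (eps_positive - eps_negative)

# Toy example: with scale 1.0 the result is just the positive prediction.
pos = np.array([1.0, 2.0])
neg = np.array([0.0, 0.0])
print(guided_noise(pos, neg, guidance_scale=1.0))  # [1. 2.]
```

A scale of 1.0 ignores the negative prompt entirely, while larger scales push the output further from sounds the user asked to avoid, which matches how the ‘negative prompt’ knob is typically exposed to end users.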
© IE Online Media Services Pvt Ltd
First uploaded on: 18-06-2024 at 17:10 IST