Nvidia Unveils AI Music Tool Capable of Unique Soundscapes
Nvidia introduces Fugatto, an AI-powered music editor that can create sounds never heard before, such as a meowing trumpet. Fugatto generates music from text and audio inputs and can also alter existing voices and sounds.
Nvidia has unveiled Fugatto, an AI-based music editor designed to generate sounds that defy traditional boundaries, like trumpets that meow. Fugatto can compose music, sounds, and speech from text and audio prompts, and can combine them in ways not heard before.
The embedded video shows Fugatto’s output from imaginative prompts, including ‘Create a saxophone howling, barking then electronic music with dogs barking.’ Nvidia also demonstrates that the model can build soundscapes from descriptions such as ‘Deep, rumbling bass pulses paired with intermittent, high-pitched digital chirps, reminiscent of a giant sentient machine awakening.’
Besides creating original sounds, Fugatto can change the tone and accent of a voice, for example making it sound angrier or calmer. It can also edit music by isolating vocals, adding new instruments to a track, and altering melodies, such as replacing a piano line with an opera singer.
The accompanying research paper lists the datasets used to train Fugatto, including sound effect libraries from the BBC. Other AI audio tools exist, including those from Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe, but Nvidia sets its offering apart with the claim that it can generate entirely novel sounds.
To develop Fugatto, Nvidia curated a dataset of millions of audio samples and formulated detailed instructions that expanded the model’s capabilities and precision without requiring additional data. Nvidia has not said when Fugatto will be available.