Generate Audio for Video
Choose a model and upload a video to generate synchronized audio.
| Model | Best for | Avoid for |
|---|---|---|
| TARO | Natural, physics-driven sounds such as footsteps, collisions, water, wind, and crackling fire. Excels when the sound is tightly coupled to visible motion and no text description is needed. | Dialogue, music, or complex layered soundscapes where semantic context matters. |
| MMAudio | Mixed scenes where you want both visual grounding and semantic control via a text prompt — e.g. a busy street scene where you want to emphasize the rain rather than the traffic. Great for ambient textures and nuanced sound design. | Pure impact/foley shots where TARO's motion-coupling would be sharper, or cinematic music beds. |
| HunyuanFoley | Cinematic foley requiring high fidelity and explicit creative direction — dramatic SFX, layered environmental design, or any scene where you have a clear written description of the desired sound palette. | Quick one-shot clips where you don't want to write a prompt, or raw impact sounds where timing precision matters more than richness. |
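
The guidance in the table can be read as a rough decision rule. The sketch below is only one possible reading of it: the function `suggest_model` and its three boolean flags are hypothetical names introduced for illustration and are not part of this app or of any of the three models' APIs.

```python
def suggest_model(has_prompt: bool, motion_driven: bool, cinematic: bool) -> str:
    """Return a model name from the table, given rough traits of a clip.

    All three flags are illustrative assumptions:
      has_prompt    - you have a written description of the desired sound
      motion_driven - the sound is tightly coupled to visible motion
      cinematic     - you want high-fidelity, layered, directed foley
    """
    # Without a prompt, the table points away from HunyuanFoley, and
    # MMAudio's main advantage (semantic control via text) does not apply.
    if not has_prompt:
        return "TARO"
    # Pure impact/foley shots: the table lists these under "Avoid for"
    # for both MMAudio and HunyuanFoley.
    if motion_driven and not cinematic:
        return "TARO"
    # Explicit creative direction with a clear written sound palette.
    if cinematic:
        return "HunyuanFoley"
    # Mixed scenes where a prompt steers emphasis (e.g. rain over traffic).
    return "MMAudio"


if __name__ == "__main__":
    # A busy street scene with a prompt emphasizing the rain.
    print(suggest_model(has_prompt=True, motion_driven=False, cinematic=False))
```

Treat the result as a starting point; the table's "Avoid for" column still applies, for example none of the rows recommends a model for dialogue.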