Scope review

Neural Audio Synthesis for Sound Effects: A Scope Review

Mateo Cámara, Fernando Marcos, Anders R Bargum, Cumhur Erkut, Joshua Reiss, José Luis Blanco

Universidad Politécnica de Madrid

Published
IEEE Trans. Audio, Speech, Language Process. · 2025

Abstract

Neural Audio Synthesis generates sound through generative neural networks. Sound effects are auditory elements that complement a scene (cinema, fiction, videogames), support a storyline, enhance a fictional environment, or improve perceived plausibility and presence (including Virtual Reality) without being music or dialog. This manuscript presents a quantitative literature review at the intersection of these domains: the neural generation of sound effects. Using large language models, we ran an extensive, systematic survey of the major scientific repositories, filtering the most relevant articles for a thorough analysis. We examine the generation paradigms used in sound synthesis, the specific types of sound effects created, the datasets used, and the evaluation metrics considered, and we discuss the field's evolution toward multimodal approaches where sound generation integrates with other sensory modalities. All supporting materials and code are available online.

This is a quantitative scope review of neural audio synthesis for sound-effect generation, published in IEEE TASLP. The review was built with an LLM-assisted survey pipeline, and its supporting materials are openly available.

Code & supporting materials

The LLM-assisted survey pipeline and supporting materials:

github.com/MateoCamara/sota.ai →

Spot a missing paper?

If a relevant article is missing from the review, please flag it through the submission form and we’ll start a change request:

Submit a missing paper →

Cite

@article{camara2025neural,
  title     = {Neural Audio Synthesis for Sound Effects: A Scope Review},
  author    = {C{\'a}mara, Mateo and Marcos, Fernando and Bargum, Anders R and Erkut, Cumhur and Reiss, Joshua and Blanco, Jos{\'e} Luis},
  journal   = {IEEE Transactions on Audio, Speech and Language Processing},
  year      = {2025},
  publisher = {IEEE}
}