Neural Audio Synthesis generates sound through generative neural networks. Sound effects are auditory elements that complement a scene (cinema, fiction, videogames), support a storyline, enhance a fictional environment, or improve perceived plausibility and presence (including Virtual Reality) without being music or dialog. This manuscript presents a quantitative literature review at the intersection of these domains: the neural generation of sound effects. Using large language models, we ran an extensive, systematic survey of the major scientific repositories, filtering the most relevant articles for a thorough analysis. We examine the generation paradigms used in sound synthesis, the specific types of sound effects created, the datasets used, and the evaluation metrics considered, and we discuss the field's evolution toward multimodal approaches where sound generation integrates with other sensory modalities. All supporting materials and code are available online.
A quantitative scope review of neural audio synthesis for sound-effect generation,
published in IEEE TASLP. The review was built with an LLM-assisted survey pipeline, and
its supporting materials are open.
Code & supporting materials
The LLM-assisted survey pipeline and supporting materials are available online.
@article{camara2025neural,
  title={Neural Audio Synthesis for Sound Effects: A Scope Review},
  author={C{\'a}mara, Mateo and Marcos, Fernando and Bargum, Anders R and Erkut, Cumhur and Reiss, Joshua and Blanco, Jos{\'e} Luis},
  journal={IEEE Transactions on Audio, Speech and Language Processing},
  year={2025},
  publisher={IEEE}
}