Audio Intellimixer generates new sounds by sampling the latent space of a Variational Autoencoder — entirely in the browser. Launch it from the Live demo button above.
This is a beta, so expect rough edges; performance is still improving. The documented, structured code will be released to support the research community once the seed paper is published.
Watch
How it works
The app loads default encoder/decoder models, then lets you query Freesound. Retrieved sounds are converted to spectrograms and passed through the encoder (optionally one you’ve uploaded), which maps each sound to a point in a D-dimensional latent space. Two of those dimensions are plotted; clicking a point samples a new latent vector, whose position can be read as a Euclidean blend of the source sounds. The decoder turns that vector back into a spectrogram and, finally, an audible, downloadable waveform.
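To make the "Euclidean blend" idea concrete, here is a minimal sketch of one way a clicked position could be turned into a full latent vector. It is an illustration only, not the app's actual code: the function name `blendAtClick`, the inverse-distance weighting, and the choice to pin the two plotted dimensions to the click are all assumptions.

```typescript
// A latent vector is a plain array of D numbers (one per latent dimension).
type Latent = number[];

/**
 * Hypothetical helper: blend the encoded source sounds into a new latent
 * vector for a clicked point. Assumption: each source is weighted by the
 * inverse of its Euclidean distance to the click, measured only in the two
 * plotted dimensions (xDim, yDim).
 */
function blendAtClick(
  sources: Latent[],
  xDim: number,
  yDim: number,
  clickX: number,
  clickY: number,
  eps: number = 1e-9,
): Latent {
  if (sources.length === 0) {
    throw new Error("no source sounds have been encoded yet");
  }
  const d = sources[0].length;

  // Inverse-distance weight per source; eps avoids division by zero
  // when the click lands exactly on a source point.
  const weights = sources.map((z) => {
    const dist = Math.hypot(z[xDim] - clickX, z[yDim] - clickY);
    return 1 / (dist + eps);
  });
  const total = weights.reduce((a, b) => a + b, 0);

  // Weighted average of the full D-dimensional vectors.
  const blended: Latent = new Array<number>(d).fill(0);
  sources.forEach((z, i) => {
    for (let k = 0; k < d; k++) {
      blended[k] += (weights[i] / total) * z[k];
    }
  });

  // Pin the plotted dimensions to the clicked coordinates.
  blended[xDim] = clickX;
  blended[yDim] = clickY;
  return blended;
}
```

Under these assumptions, a click midway between two sources yields the average of their unplotted dimensions, while a click on top of a source reproduces that source almost exactly. The resulting vector is what would be handed to the decoder.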
Quick guide
The controls run down the right-hand side of the app:
- Freesound query — a keyword plus a go button to fetch sounds (default: footstep).
- X-axis variable — choose which latent dimension maps to X (default: dimension 1). The menu fills once the autoencoder has loaded.
- Y-axis variable — likewise for Y (default: dimension 2).
- Upload an autoencoder — supply your own VAE: the JSON architecture file plus the binary weight files.
- Status — a progress bar and message reporting what the app is doing.
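For readers curious what the Freesound query control might send, the sketch below builds a text-search URL for the public Freesound APIv2 endpoint. Whether the app uses this exact endpoint, these fields, or token authentication is an assumption; `buildFreesoundSearchUrl` is a hypothetical helper and the token is a placeholder for your own API key.

```typescript
/**
 * Hypothetical helper: build a Freesound APIv2 text-search URL.
 * Assumptions: the keyword defaults to "footstep" (the app's default query),
 * and only the id, name, and preview URLs are requested for each sound.
 */
function buildFreesoundSearchUrl(
  token: string,
  keyword: string = "footstep",
  pageSize: number = 15,
): string {
  const params: [string, string][] = [
    ["query", keyword],
    ["fields", "id,name,previews"],
    ["page_size", String(pageSize)],
    ["token", token],
  ];
  // Percent-encode every key and value so spaces and commas are safe.
  const queryString = params
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `https://freesound.org/apiv2/search/text/?${queryString}`;
}
```

The returned string can be passed to `fetch` in the browser; the JSON response lists matching sounds, whose preview files would then be decoded to audio before the spectrogram step.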
Once the Freesound audio has loaded, click anywhere in the latent space to generate a new sound, then download it. Advanced usage is described in detail in the seed paper.
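The click step depends on translating a pixel position in the plot into coordinates along the chosen X and Y latent dimensions. A minimal sketch of that mapping follows, assuming a linear axis scale and a plot whose origin is the top-left corner; `AxisRange` and `pixelToLatent` are hypothetical names, not taken from the app.

```typescript
// Visible range of one plotted latent dimension (assumed linear scale).
interface AxisRange {
  min: number;
  max: number;
}

/**
 * Hypothetical helper: convert a click at pixel (px, py) inside a plot of
 * the given width and height into latent-plane coordinates.
 */
function pixelToLatent(
  px: number,
  py: number,
  width: number,
  height: number,
  xRange: AxisRange,
  yRange: AxisRange,
): { x: number; y: number } {
  // Normalise to [0, 1]; screen y grows downward, so flip it.
  const tx = px / width;
  const ty = 1 - py / height;
  return {
    x: xRange.min + tx * (xRange.max - xRange.min),
    y: yRange.min + ty * (yRange.max - yRange.min),
  };
}
```

For example, a click at pixel (200, 100) in a 400 by 400 plot spanning X from -2 to 2 and Y from -1 to 1 maps to the latent coordinates (0, 0.5).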
Notes
Tested in Google Chrome; other browsers are untested. If you hit problems in Chrome, check the walkthrough video and get in touch — see the contact links below.