In this appendix, we provide detailed information about the questionnaire given to participants for the subjective evaluation. The questionnaire was accessible online here. It was originally written in Spanish; we include an English translation of each section in the following pages. This translation aims to present the content accurately to English-speaking audiences and to ensure a broader understanding of the subjective assessment process used in our study. The structure and questions of the questionnaire are replicated in English, retaining the purpose of the original Spanish version. The listening questions are available online here.
Welcome to our subjective test on intelligibility and the voice synthesizer’s imitation capability compared to the human voice. Thank you for taking the time to participate.
The aim of this evaluation is to assess the automatic control of an articulatory synthesizer called Pink Trombone. The Pink Trombone is a tool that can mimic human speech by automatically controlling the movement of the tongue, lips, vocal cords, and other articulators.
Before you begin, take a moment to familiarize yourself with the Pink Trombone by visiting this link: https://www.dood.al/pinktrombone/
In this test, we seek to verify how accurately the Pink Trombone can reproduce human vowel sounds that can be correctly interpreted, and how well it can imitate a given human speech recording. Please note the following details:
If you have any questions, you can contact us at mateo.camara@upm.es
Evaluate the ability to interpret a vowel.
For the following sounds, rate the resemblance to the vowel indicated.
Rate from 0 (I do not interpret that vowel at all) to 100 (it is perfectly interpreted).
Available online.
For the following sounds, evaluate the resemblance to the indicated set of sounds.
For the following sounds, you will be asked how well you perceive a sequence of vowels.
Rate from 0 (I do not interpret that sequence at all) to 100 (it is perfectly interpreted).
Available online.
In the following questions, you will be asked to evaluate how well synthetic sounds imitate a human reference sound. Note that the sounds cannot be identical. You are asked to assess coherence, that is, how convincing the imitation would be if one person were trying to imitate another, keeping in mind that a person cannot change their own vocal cords, throat, tongue, etc.
Evaluate the ability to imitate the sound with respect to a human reference. Listen to the reference first.
Rate from 0 (imitation is very poor) to 100 (imitation is very credible).
Available online.