The speech data themselves are usually not a subject of special data protection concern once the speaker has agreed to waive his/her rights to the recorded data [2.2]. However, this might not be the case for biometric databases. A biometric speech corpus is a speech corpus designed specifically for developing and/or evaluating voice authentication systems. In some cases the data provided by the speaker might be abused to break into future security systems based on the new technology [2.3]. Although this is rather unlikely, we recommend taking extra care in these cases that the mapping between personal speaker data and the speaker ID [2.4] within the corpus is inaccessible to everybody, including former staff of the speech corpus production project.
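One way to keep the mapping out of the corpus is to assign speaker IDs that cannot be derived from any personal data and to hold the mapping table separately from the distributed material. The following is only a minimal sketch of that idea in Python. The function name `assign_speaker_ids` and the record fields (`name`, `email`, `sex`, `age`) are hypothetical and chosen for illustration; they are not part of any corpus standard.

```python
import secrets


def assign_speaker_ids(speakers, id_bytes=4):
    """Split speaker records into corpus records and a separate mapping.

    `speakers` is a list of dicts with the (hypothetical) keys
    'name', 'email', 'sex' and 'age'. The returned corpus records
    contain only the random speaker ID and non-identifying fields.
    The mapping links each ID back to the personal data and must be
    stored apart from the corpus (or destroyed).
    """
    corpus_records = []
    mapping = {}
    for person in speakers:
        # Draw a random ID; it carries no information about the person.
        speaker_id = secrets.token_hex(id_bytes)
        while speaker_id in mapping:  # re-draw on the rare collision
            speaker_id = secrets.token_hex(id_bytes)
        mapping[speaker_id] = {"name": person["name"], "email": person["email"]}
        corpus_records.append(
            {"speaker_id": speaker_id, "sex": person["sex"], "age": person["age"]}
        )
    return corpus_records, mapping


if __name__ == "__main__":
    # Invented example data, for illustration only.
    speakers = [
        {"name": "Speaker One", "email": "one@example.org", "sex": "f", "age": 34},
        {"name": "Speaker Two", "email": "two@example.org", "sex": "m", "age": 51},
    ]
    records, mapping = assign_speaker_ids(speakers)
    print(records)   # goes into the corpus
    mapping.clear()  # here the mapping is simply discarded
```

Because the IDs are drawn at random, discarding the mapping leaves nothing in the corpus records that links a speaker ID back to a person. Whether the mapping is destroyed or kept under restricted access is a project decision.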
Metadata about the speakers, that is, personal data such as home address, telephone numbers, email etc., must always be protected and are subject to special laws in most countries. Please contact your legal advisors about how to properly store and protect data of this kind if you decide to collect them.