S0063 : German SpeechDat(II) FDB-4000
The German SpeechDat(II) FDB-4000 contains the recordings of
4000 German speakers (1938 males, 2060 females, and 2 unknown-gender speakers),
recorded over the German fixed network.
The database is distributed on 17 CD-ROMs in the final SpeechDat(II)
database exchange format.
Speech samples are stored as sequences of 8-bit, 8 kHz A-law samples.
Each prompted utterance is stored in a separate file. Each signal file is accompanied
by an ASCII SAM label file which contains the relevant descriptive information.
The database was validated by SPEX (the Netherlands) to assess its compliance
with the SpeechDat format and content specifications.
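The 8-bit A-law storage described above can be expanded to linear PCM with the standard G.711 A-law rule. The following is a minimal sketch only: it assumes the signal files are headerless (the descriptive information being held in the separate label file), and the function names alaw_to_linear, decode_alaw and duration_seconds are illustrative, not part of the database or its tools.

```python
from typing import List

SAMPLE_RATE_HZ = 8000  # 8 kHz, as stated in the description


def alaw_to_linear(byte: int) -> int:
    """Expand one 8-bit A-law code to a signed 16-bit-range PCM value (G.711)."""
    byte ^= 0x55                      # undo the even-bit inversion used by A-law
    mantissa = (byte & 0x0F) << 4     # 4-bit mantissa, shifted into position
    segment = (byte & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # After the XOR, a set sign bit denotes a positive sample.
    return value if byte & 0x80 else -value


def decode_alaw(data: bytes) -> List[int]:
    """Decode a raw A-law byte string (assumed headerless) to PCM samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """One byte per sample at 8 kHz, so duration is byte count / 8000."""
    return len(data) / SAMPLE_RATE_HZ
```

Under these assumptions, the bytes of one signal file can be passed directly to decode_alaw, and the file size alone gives the utterance length via duration_seconds.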
Each speaker uttered the following items:
- 1 isolated digit
- 1 sequence of 10 isolated digits
- 4 connected digits (1 prompt sheet number, ≥ 5 digits; 1 telephone number, 9/11
digits; 1 credit card number, 15/16 digits; 1 read PIN code, 6 digits)
- 1 natural number
- 1 money amount
- 2 yes/no questions (spontaneous, not prompted)
- 3 dates (1 spontaneous e.g. birthday, 1 prompted date, 1 relative and general
date expression)
- 2 time phrases (1 spontaneous time of day, 1 read time phrase)
- 3 application words
- 1 word spotting phrase
- 5 directory assistance names (1 spontaneous name, e.g. forename, 1 spontaneous
city name, 1 read city name from a list of the 500 most frequent, 1 read company/agency
name from a list of the 500 most frequent, 1 read proper name, forename and surname,
from a list of 150 names)
- 3 spellings (1 spontaneous e.g. forename, 1 directory city name, 1 real/artificial
word)
- 4 isolated words
- 9 phonetically rich sentences
The following age distribution has been obtained: 204 speakers
are under 16, 1685 speakers are between 16 and 30, 1166 speakers are between
31 and 45, 729 speakers are between 46 and 60, and 216 speakers are over 60.
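As a quick consistency check, the gender and age counts quoted in this description can be tallied against the stated total of 4000 speakers. This is only a sketch using the figures given here; the dictionary keys and the helper name shares are illustrative labels, not fields of the database.

```python
TOTAL_SPEAKERS = 4000

# Counts copied from the description above.
gender_counts = {"male": 1938, "female": 2060, "unknown": 2}
age_counts = {
    "under 16": 204,
    "16-30": 1685,
    "31-45": 1166,
    "46-60": 729,
    "over 60": 216,
}


def shares(counts):
    """Return each group's share of the total, in percent."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


# Both breakdowns should account for every speaker.
assert sum(gender_counts.values()) == TOTAL_SPEAKERS
assert sum(age_counts.values()) == TOTAL_SPEAKERS

for group, pct in shares(age_counts).items():
    print(f"{group}: {pct:.1f}%")
```

Both breakdowns sum to 4000, so the quoted figures are internally consistent.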
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.