Commercial and research corpora in various languages: "Speechocean corpus"

■ This article was posted on June 11, 2016, so the information may be out of date.

A page for the Speechocean corpus, a collection of commercial and research corpora in various languages, has been added to the Unipos website.

Speechocean of China offers a large number of corpora, including ASR-Corpus (automatic speech recognition corpus), TTS-Corpus (speech synthesis corpus), and Text-Corpus (text corpus): about 500 types for commercial use and about 250 types for research.

The corpora cover more than 110 languages and dialects (accents) and are finely classified by age, gender, recording time, recording platform, and so on. When inquiring, please tell us the name and SN (King-) of the corpus you want.

KingLine Data Center (Manufacturer site)

[Examples of corpora that Unipos has handled]

- King-ASR-090
US English Speech Recognition Corpus - Complex (Desktop) - 50 Speakers
Recording hours: 49.8 hours

- King-ASR-139
US English Speech Recognition Corpus - Sentence (Mobile) - 150 Speakers
Recording hours: 98 hours

- King-ASR-213
US English Speech Recognition Corpus - SMS / Sentence (Desktop) - 200 Speakers
Recording hours: 164.53 hours

- King-ASR-050
Japan English Speech Recognition Corpus - Sentence (Desktop) - 201 Speakers
Recording hours: 382.5 hours