English Speech Databases Overview

2020.08.13

Speechocean: AI Data Resource and Data Service Provider

 

To date, Speechocean has released multiple off-the-shelf English databases that are available for licensing, including:

 

-- 30,000+ hours of speech recognition corpora

-- 124 hours of speech synthesis corpora

-- 5 English lexica with 800,000+ entries in total

 

Please check the details below:

 

Speech Recognition Corpus

Language                      Speakers     Hours
American English                 8,747  8,151.00
Chinese (Mainland) English       8,313  5,607.00
Indian English                   2,607  3,897.00
British English                  2,681  3,522.00
Australian English               1,285  1,951.00
Canadian English                 1,607  1,309.00
Japanese English                 1,005    902.00
Chinese (Hong Kong) English        412    821.00
Singapore English                  404    710.00
Russian English                    230    492.00
Romanian English                   201    389.29
French English                     225    378.00
Indonesian English                 804    378.00
South African English              200    359.20
Nigerian English                   206    357.40
Portugal English                   201    341.10
Filipino English                   207    326.40
Spain English                      200    326.00
New Zealand English                200    311.00
German English                     196    306.00
Irish English                      204    302.90
Korean English                     116    206.70

 

Speech Synthesis Corpus

Language          Gender  Hours
American English  Male    34.88
American English  Female  54.57
British English   Male    10.45
British English   Female  24.37

 

Lexicon

Language                                      Entries
American English (person and location names)  351,621
American English (loan words)                 190,328
American English (POS)                        104,177
British English                               100,898
Indian English                                106,493


 Email: marketing@speechocean.com

