Southeast Asian Databases Overview
2020.07.13

Speechocean: AI Data Resource and Data Service Provider 


To date, Speechocean offers 13,200 hours of off-the-shelf Southeast Asian speech corpora and 4 Southeast Asian pronunciation lexica, all available for licensing.


See the tables below for details:


Speech Recognition Corpus

Language                       Speakers   Total Hours
Indonesian                     1,063      2,769
Malay                          1,726      2,075
Tagalog                        257        424
Thai                           1,216      3,463
Urdu                           583        1,148
Vietnamese                     1,446      1,595
Singapore English              404        710
Filipino English               207        326
Indonesian English             402        126
Pakistani American English     100        199
Filipino American English      100        172
Vietnamese American English    100        194



Lexicon

Language      Entries
Thai          114,797
Malay         101,799
Vietnamese    104,088
Urdu          101,211


Speechocean has always been dedicated to providing engineering data products and services to enterprises and scientific research institutions across the entire AI industry chain. Our business covers domains such as speech recognition, speech synthesis, computer vision, lexica, and natural language processing, and we provide related services such as data design, collection, transcription, and annotation.


If you have any further inquiries, please do not hesitate to contact us.

Email: marketing@speechocean.com

