Southeast Asian Databases Overview
2020.03.10

Speechocean: AI Data Resource and Data Service Provider


In recent years, the economies of Southeast Asian countries have developed rapidly. As a result, a growing number of AI technologies are being deployed in Southeast Asia, which is driving demand for more speech corpora.


In addition to providing a one-stop data solution, Speechocean offers 13,000 hours of off-the-shelf Southeast Asian speech corpora and 3 Southeast Asian pronunciation lexica, all available for licensing.


Please see the tables below:


ASR Corpus

| Language | Speakers | Total Hours |
|---|---|---|
| Thai | 1,216 | 3,463 |
| Indonesian | 1,069 | 2,800 |
| Malay | 1,726 | 2,075 |
| Vietnamese | 1,070 | 1,264 |
| Urdu | 583 | 1,148 |
| Tagalog | 257 | 507 |
| Singapore English | 404 | 710 |
| Filipino English | 207 | 326 |
| Filipino American English | 100 | 172 |
| Vietnamese American English | 100 | 194 |
| Pakistani American English | 100 | 199 |
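
As a quick cross-check of the ASR corpus figures listed above, the following minimal sketch tallies the speaker and hour columns. The numbers are copied from the table; the names `ASR_CORPORA`, `total_speakers`, and `total_hours` are illustrative and not part of any Speechocean tooling.

```python
# Speakers and total hours per ASR corpus, copied from the table above.
# The dictionary and function names are hypothetical, chosen for illustration.
ASR_CORPORA = {
    "Thai": (1216, 3463),
    "Indonesian": (1069, 2800),
    "Malay": (1726, 2075),
    "Vietnamese": (1070, 1264),
    "Urdu": (583, 1148),
    "Tagalog": (257, 507),
    "Singapore English": (404, 710),
    "Filipino English": (207, 326),
    "Filipino American English": (100, 172),
    "Vietnamese American English": (100, 194),
    "Pakistani American English": (100, 199),
}


def total_speakers(corpora):
    """Sum the speaker counts across all listed corpora."""
    return sum(speakers for speakers, _ in corpora.values())


def total_hours(corpora):
    """Sum the recorded hours across all listed corpora."""
    return sum(hours for _, hours in corpora.values())


if __name__ == "__main__":
    print(f"Corpora listed: {len(ASR_CORPORA)}")
    print(f"Total speakers: {total_speakers(ASR_CORPORA):,}")
    print(f"Total hours:    {total_hours(ASR_CORPORA):,}")
```

The listed rows add up to 12,858 hours, which is in line with the rounded 13,000-hour figure quoted earlier.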


Lexicon

| Language | Entries |
|---|---|
| Urdu | 101,211 |
| Vietnamese | 104,088 |
| Malay | 101,935 |


Speechocean is dedicated to providing specialized data products and engineering services to enterprises and scientific research institutions across the entire AI industry chain. Our business covers domains such as speech recognition, speech synthesis, computer vision, lexica, and natural language processing, and we provide services for the design, collection, transcription, and annotation of data, among others.

If you have any further inquiries, please do not hesitate to contact us.

Email: contact@speechocean.com

