+86 -10-62660053
contact@speechocean.com
Speechocean: AI Data Resource and Data Service Provider
To date, Speechocean offers 13,200 hours of off-the-shelf Southeast Asian speech corpora and 4 Southeast Asian pronunciation lexica, all available for licensing.
Please see the tables below:
Speech Recognition Corpus
Language | Speakers | Total Hours |
Indonesian | 1,063 | 2,769 |
Malay | 1,726 | 2,075 |
Tagalog | 257 | 424 |
Thai | 1,216 | 3,463 |
Urdu | 583 | 1,148 |
Vietnamese | 1,446 | 1,595 |
Singapore English | 404 | 710 |
Filipino English | 207 | 326 |
Indonesian English | 402 | 126 |
Pakistani American English | 100 | 199 |
Filipino American English | 100 | 172 |
Vietnamese American English | 100 | 194 |
Lexicon
Language | Entries |
Thai | 114,797 |
Malay | 101,799 |
Vietnamese | 104,088 |
Urdu | 101,211 |
Speechocean has always been devoted to providing engineering data products and services to enterprises and scientific research institutions across the entire AI industry chain. Our business spans domains such as speech recognition, speech synthesis, computer vision, lexicons, and natural language processing, and we provide related data services including design, collection, transcription, and annotation.
If you have any further inquiries, please do not hesitate to contact us.
Email: marketing@speechocean.com