Speechocean, a well-known supplier of data resources and data services, is dedicated to providing high-quality databases and efficient
services to its academic and industrial customers, helping them create value in the fields of Human-Computer Interaction and Human Language Technology.
Speechocean provides many types of large databases in many languages and accents, along with data-related services such
as data design, collection, transcription, annotation, validation, linguistic services, and other related processing, for many
technical fields, including speech synthesis, speech recognition, machine translation, web search, natural language understanding, and image
recognition.
To meet a customer's specific requirements, we can also provide a One-Stop Data Solution, which covers data design,
data collection, data processing, modeling and model training, testing, and evaluation. Please click "One-Stop Data Solution" for details.
Over the past 15 years, Speechocean has provided its customers with more than 1,000 databases and various customized services
covering 110 languages and accents. To date, Speechocean has established ten overseas offices, staffed by experienced
international teams and supported by a sophisticated project management process, in Hong Kong, the UK, Germany, Spain, Canada, Russia, and
other countries and regions.
Thanks to its large-scale data solution capability, combined with guaranteed high quality, cost optimization, and fast delivery,
Speechocean has earned a strong reputation and has established long-term cooperation with a diverse range of customers around the world.
Speechocean is also one of the world's largest language resource providers. As of the end of 2014, KingLine Data Center (operated by Speechocean) held nearly 500 large-scale off-the-shelf corpora, covering 110+ languages and accents and 70+ regions around the
world, that can be licensed to customers. All of these corpora come with fully independent intellectual property rights and multiple layers of
transcription and annotation, and can meet customers' diverse requirements for modeling and model training in the Human-Computer
Interaction field. (Please click "KingLine Data Center-Commercial Resources" for the data list.)
KingLine Data Center also has hundreds of high-quality academic resources to meet the experimental and testing needs of scientific
research institutions, colleges, and individuals around the world. All of these corpora can be provided at minimal cost, far
below their actual value. We also welcome members to share and exchange data with us. (Please click "KingLine Data Center-Academic Resources" for the data list.)
KingLine Data Center, Language Resources, Speech Synthesis Datasets, Speech Recognition Datasets
Speech Data Transcription/ Annotation
Image Data Collection
Image Data Labelling
Multi-Language Linguistic Lexicon