Welcome to the South African Centre for Digital Language Resources website

Lwazi Sepedi TTS corpus

Availability: Available for download

R0.00

Quick Overview

Orthographic and phonemically aligned transcriptions

Additional Information

Contact persons and email addresses: Karen Calteaux (KCalteaux@csir.co.za)
Affiliations: Meraka Institute, CSIR
Licensing: Creative Commons Attribution Non-Commercial No-Derivatives 3.0 South Africa license
Licensing details: http://creativecommons.org/licenses/by-nc-nd/3.0/za/
Names of principal developers: Daniel van Niekerk, Etienne Barnard, Marelie Davel, Aby Louw, Alta de Waal
Media type: Speech
ISLRN: 045-689-479-254-9
Category: Monolingual speech corpora: Annotated
Annotation details: Orthographic and phonemically aligned transcriptions
Citation information: Badenhorst, Jaco, Charl van Heerden, Marelie Davel, and Etienne Barnard. "Collecting and evaluating speech recognition corpora for 11 South African languages." Language Resources and Evaluation 45, no. 3 (2011): 289-309 - www.meraka.org.za/lwazi/publications/badenhorst09collecting.pdf?
Description of background and purpose: TTS corpus for standard SA dialect. This corpus was created to enable the building of a TTS voice.
Distribution: Downloadable
Source: Books, Government Documents, Periodicals, Web
Stratum (structure of data): Phonetically balanced sentences were chosen from the reference texts to ensure adequate phone coverage.
Size (number of tokens/duration): 26 minutes
File size: 52 MB (unzipped)
Specialised software required: N/A
Maturity: Released
Verification and proof of quality: Manually verified by a language expert.
Compatibility with standards: Some well-defined guidelines (in-house/external)
Details of documentation available:

- Badenhorst, Jaco, Charl van Heerden, Marelie Davel, and Etienne Barnard. "Collecting and evaluating speech recognition corpora for 11 South African languages." Language Resources and Evaluation 45, no. 3 (2011): 289-309 - www.meraka.org.za/lwazi/publications/badenhorst09collecting.pdf?

- Lwazi Project Final Report "Development of a telephone-based speech-driven information service for the South African Government" - http://www.meraka.org.za/lwazi/publications.php

- C. van Heerden, E. Barnard and M. Davel, "Basic Speech Recognition for Spoken Dialogues," in Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech), Brighton, United Kingdom, September 2009, pp. 3003-3006.

Standards compliance details: No
Contributors: No
