Welcome to the South African Centre for Digital Language Resources website

Lwazi Afrikaans ASR corpus

Availability: Available for download

R0.00

Quick Overview

Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.

Additional Information

Contact persons and email addresses: Jaco Badenhorst (jbadenhorst@csir.co.za)
Affiliations: Meraka Institute, CSIR
Licensing: Creative Commons Attribution 2.5 South Africa License
Licensing details: http://creativecommons.org/licenses/by/2.5/za/legalcode
Names of principal developers: Charl van Heerden, Etienne Barnard, Jaco Badenhorst, Marelie Davel
Media type: Speech
ISLRN: 684-473-782-014-1
Category: Monolingual speech corpora: Annotated
Annotation details: Orthographic transcription
Citation information: E. Barnard, M. Davel and C. van Heerden, "ASR Corpus Design for Resource-Scarce Languages," in Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech), Brighton, United Kingdom, September 2009, pp. 2847-2850.
Description of background and purpose: Complete audio recordings and orthographic transcriptions used for Lwazi speech recognition systems.
Distribution: Downloadable
Source: Books, Government Documents, Periodicals, Web
Stratum (structure of data): 200 speakers, each with ~14 elicited utterances and ~16 phonetically balanced read sentences.
Size (number of tokens/duration): 04:16:55
File size: 273 MB (unzipped)
Specialised software required: N/A
Maturity: Released
Verification and proof of quality: Manually verified by language expert.
Compatibility with standards: Some well-defined guidelines (in-house/external)
Details of documentation available:

- J. Badenhorst, C. van Heerden, M. Davel and E. Barnard, "Collecting and evaluating speech recognition corpora for 11 South African languages," Language Resources and Evaluation, vol. 45, no. 3, 2011, pp. 289-309. www.meraka.org.za/lwazi/publications/badenhorst09collecting.pdf?

- Lwazi Project Final Report, "Development of a telephone-based speech-driven information service for the South African Government." http://www.meraka.org.za/lwazi/publications.php

- C. van Heerden, E. Barnard and M. Davel, "Basic Speech Recognition for Spoken Dialogues," in Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech), Brighton, United Kingdom, September 2009, pp. 3003-3006.

Standards compliance details: No
Contributors: No
