<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="6.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Peter Waiganjo Wagacha</style></author><author><style face="normal" font="default" size="100%">De Pauw, Guy</style></author><author><style face="normal" font="default" size="100%">K. Getao</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">J. C. Roux</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Development of a corpus for Gĩkũyũ using machine learning techniques</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of LREC workshop - Networking the development of language resources for African languages</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2006</style></year></dates><urls><related-urls><url><style face="normal" font="default" size="100%">http://aflat.org/files/6_depauw.pdf</style></url></related-urls></urls><edition><style face="normal" font="default" size="100%">J.C. Roux</style></edition><pub-location><style face="normal" font="default" size="100%">Genoa, Italy</style></pub-location><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Networking the development of computational resources for African languages can be greatly advanced if researchers aim to develop tools that are to a large extent language-independent and therefore reusable for other languages. In this paper we describe a particular case study, namely the development of an annotated corpus of Gĩkũyũ, using language-independent machine learning techniques.
The general aim of our work on Gĩkũyũ is two-fold: on the one hand we wish to digitally preserve this resource-scarce language, while on the other hand it serves as a feasibility study of using language-independent machine learning techniques for linguistic annotation of corpora. To this end we investigate established annotation induction techniques like unsupervised learning and knowledge transfer. These methods can provide interesting perspectives for the linguistic description of many other resource-scarce languages.</style></abstract></record></records></xml>
