Gĩkũyũ
Unsupervised Learning of Morphology and the Languages of the World
Submitted by Guy on Mon, 2010-03-01 14:12
Unsupervised Learning of Morphology and the Languages of the World, Department of Computer Science and Engineering, Gothenburg (2009)
Automatic Diacritic Restoration for African Languages
Submitted by Guy on Tue, 2007-10-23 12:16
The orthography of many African languages includes diacritically marked characters. Falling outside the scope of the standard Latin encoding, these characters are often represented in digital language resources as their unmarked equivalents. This renders corpus compilation more difficult, as these languages typically do not have the benefit of large electronic dictionaries to perform diacritic restoration.
This is a demonstration system for a diacritic restoration method that automatically restores diacritics on the basis of local graphemic context. It is based on the machine learning method of memory-based learning. We have applied the method to the African languages Cilubà, Gĩkũyũ, Kĩkamba, Maa, Sesotho sa Leboa, Tshivenḓa and Yoruba.
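The idea of restoring diacritics from local graphemic context with a memory-based learner can be sketched roughly as follows. This is a minimal illustration, not the system described above: the window width, the overlap metric, the function names (`align`, `windows`, `train`, `restore`) and the toy word list are all assumptions made for the example, and the word list is not a verified corpus.

```python
import unicodedata
from collections import Counter

def align(marked_text):
    """Pair each base character with its fully marked form."""
    pairs = []
    for ch in unicodedata.normalize("NFD", marked_text):
        if unicodedata.combining(ch) and pairs:
            base, full = pairs[-1]
            pairs[-1] = (base, full + ch)  # attach the mark to its base
        else:
            pairs.append((ch, ch))
    return [(b, unicodedata.normalize("NFC", f)) for b, f in pairs]

def windows(stripped, width=3):
    """Yield one context window per character, centred on that character."""
    padded = "_" * width + stripped + "_" * width
    for i in range(len(stripped)):
        yield tuple(padded[i:i + 2 * width + 1])

def train(marked_texts, width=3):
    """Memory-based 'training': store every (window, marked character) pair."""
    memory = []
    for text in marked_texts:
        pairs = align(text)
        stripped = "".join(base for base, _ in pairs)
        for feats, (_, label) in zip(windows(stripped, width), pairs):
            memory.append((feats, label))
    return memory

def restore(stripped, memory, width=3):
    """Label each character by majority vote among its nearest stored windows."""
    out = []
    for feats, ch in zip(windows(stripped, width), stripped):
        # Only stored instances with the same focus character are candidates.
        scored = [(sum(a == b for a, b in zip(feats, f)), label)
                  for f, label in memory if f[width] == ch]
        if not scored:
            out.append(ch)  # unseen character: leave it unchanged
            continue
        top = max(score for score, _ in scored)
        votes = Counter(label for score, label in scored if score == top)
        out.append(votes.most_common(1)[0][0])
    return "".join(out)

# Toy training data for illustration only (hypothetical, not a real corpus).
TOY_WORDS = ["gĩkũyũ", "mũndũ", "kĩrĩma", "andũ"]
memory = train(TOY_WORDS)
print(restore("gikuyu", memory))
```

Because every training instance is kept in memory and classification is a similarity lookup, no dictionary is needed; the quality of the output depends entirely on how much correctly marked text is available to store.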
You can find more information on this system in this paper
Authors:
Guy De Pauw: CNTS - Language Technology Group, University of Antwerp, Antwerp, Belgium, guy [dot] depauw [at] ua [dot] ac [dot] be
Gilles-Maurice de Schryver: African Languages and Cultures, Ghent University, Ghent, Belgium, gillesmaurice [dot] deschryver [at] ugent [dot] be
Peter Waiganjo Wagacha: School of Computing and Informatics, University of Nairobi, Nairobi, Kenya, waiganjo [at] uonbi [dot] ac [dot] ke
Gĩkũyũ Diacritic Placement - Demo
Submitted by Guy on Tue, 2006-12-12 14:52
The orthography of Gĩkũyũ includes a number of accented characters to represent the entire vowel system (namely ĩ and ũ). Not available on standard computer keyboards, these characters are usually typed as the nearest available characters (i and u).
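The loss of information described here can be reproduced with a few lines of Python. The sketch below is only an illustration of how marked text collapses to its keyboard form; the function name `to_keyboard_form` is hypothetical and not part of the demo.

```python
import unicodedata

def to_keyboard_form(text):
    """Replace each accented character by its unmarked base character."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(to_keyboard_form("Gĩkũyũ"))
```

Since both ĩ and i end up as i (and both ũ and u as u), the mapping cannot be inverted character by character; every i and u in typed text is a position where the accent has to be inferred from context.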
A grapheme-based approach to accent restoration in Gĩkũyũ
Submitted by Guy on Tue, 2006-12-12 14:41
A grapheme-based approach to accent restoration in Gĩkũyũ, Proceedings of the Fifth International Conference on Language Resources and Evaluation, May 2006, Genoa, Italy, pp. 1937-1940 (2006)
Development of a corpus for Gĩkũyũ using machine learning techniques
Submitted by Guy on Tue, 2006-12-12 14:41
Development of a corpus for Gĩkũyũ using machine learning techniques, Proceedings of LREC workshop - Networking the development of language resources for African languages, Genoa, Italy (2006)
