Diacritic restoration is a necessity in processing languages whose Latin-based scripts use letters outside the basic Latin alphabet used by English. Yorùbá is one such language, marking an underdot (dot-below) on three characters and tone marks on all seven vowels and the two syllabic nasals. The problem of restoring underdotted characters has been fairly well addressed using the character as the linguistic unit for restoration. However, the existing character-based and word-based approaches have not sufficiently addressed the restoration of tone marks in Yorùbá. In this study we address tone mark restoration as a subset of diacritic restoration. We propose using the syllable (derived from the word) as the linguistic token for tone mark restoration. In our experimental setup, we used Yorùbá text collected from various sources, with a total of 250,336 words. On syllabification, these words yielded 464,274 syllables. The syllables were divided into training and testing data in different proportions, ranging from 99% for training and 1% for testing to 70% for training and 30% for testing. The aim of evaluating different proportions was to determine how the ratio of training to test data affects the variation in the results. We applied memory-based learning to train the models. We also set up a similar experiment using character tokens in order to compare performance. The results showed that using the syllable raised word-level accuracy to 96.23%, an average improvement of almost 15% over the accuracy obtained using characters. We also found that using 75% of the data for training and the remaining 25% for testing gives the results with the least variation in a ten-fold cross-validation test. Hybridizing this method, which uses syllables as the linguistic processing unit, with other methods such as lexicon lookup is likely to improve on the current result.
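The splitting of the syllable data into training and testing portions can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_tokens`, the shuffling step, and the fixed seed are assumptions, since the abstract does not say how the syllables were assigned to each portion.

```python
import random


def split_tokens(tokens, train_fraction, seed=0):
    """Split tokens into (train, test) lists by the given training fraction.

    Assumption: tokens are shuffled before splitting; the abstract does not
    state whether the original split was random or sequential.
    """
    shuffled = list(tokens)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Stand-in data: integers in place of real syllable tokens.
    data = list(range(1000))
    # The two endpoints from the abstract plus the 75/25 split it singles out.
    for fraction in (0.99, 0.75, 0.70):
        train, test = split_tokens(data, fraction)
        print(fraction, len(train), len(test))
```

Repeating the split with different seeds for each proportion is one way to observe how much the results vary with the training-to-test ratio.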
I illustrate the historical roots of the theory that I later called “Asymptotic Representation Theory”, a theory that can be considered part of functional analysis and representation theory and, more generally, of probability theory, asymptotic combinatorics, the theory of random matrices, dynamics, etc. The first and very concrete example is a remarkable (and forgotten) paper by J. von Neumann, which I try here to connect with the modern theory of random matrices; the second example is a quotation of an important thought of H. Weyl about the theory of symmetric groups. In the last section I give a short review of the ideas of asymptotic representation theory, which has been developed since the 1970s and has now become very popular. I mention several important problems and give an (incomplete) list of references. But the reader must remember that this is just a synopsis of the “baby talk”.
Let M be a fixed positive integer and let N run through products of two odd primes with (M, N) = 1. The quadratic residues mod N in the interval (0, N) are shown to be asymptotically uniformly distributed among the residue classes mod M.
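The statement can be explored numerically with a small brute-force count. The sketch below is only an illustration under a stated assumption: it treats a quadratic residue mod N as a number a in (0, N) that is coprime to N and congruent to a square, a convention the abstract does not spell out. The function names are hypothetical.

```python
from collections import Counter
from math import gcd


def quadratic_residues(n):
    """Return the set of a in (0, n) with gcd(a, n) = 1 and a = x^2 mod n.

    Assumption: residues are required to be coprime to n.
    """
    return {x * x % n for x in range(1, n) if gcd(x, n) == 1}


def class_counts(n, m):
    """Count the quadratic residues mod n falling in each residue class mod m."""
    counts = Counter(a % m for a in quadratic_residues(n))
    return [counts.get(r, 0) for r in range(m)]


if __name__ == "__main__":
    m = 4
    # Products of two distinct odd primes, each coprime to m.
    for p, q in [(3, 5), (7, 11), (101, 103), (499, 503)]:
        n = p * q
        print(n, class_counts(n, m))
```

For N = 15 the residues under this convention are 1 and 4. As the primes grow, the printed counts per class mod M can be compared with the equal shares that the theorem predicts asymptotically.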