This paper analyses the feasibility of using ontologies to improve the semantic analysis of sentences formulated in Polish and treated as database queries. It focuses on describing the semantics of queries and the semantics of the database on the basis of Description Logic.
EN
This paper presents a feasibility analysis of incorporating ontologies into the semantic analysis of queries, where the database queries are formulated in natural Polish. In the presented work, the semantics of the queries and of the database were formalized using Description Logic.
The current linguistic situation of the world is nothing short of catastrophic, as many small languages are seriously threatened with complete extinction, unfortunately within the next few years. This state of affairs is by no means favourable from the standpoint of the highly desirable preservation of the diversity of humanity's cultural heritage. It should also be remembered that every human language in use is indispensable research material for linguists, on the basis of which scholars can test their hypotheses concerning the still not fully understood nature of human language, its origin, so-called linguistic universals, and the evolution of entire language groups and families. Consequently, the scientific community has undertaken numerous initiatives aimed at saving from oblivion the many human languages with small numbers of speakers. In the article, the authors put forward an original proposal to build suitable software tools in the form of generators of syntactic structures, which could considerably streamline the reconstruction and revitalization of endangered languages, and even of languages that have been completely dead for many years. The authors' current interest lies in the peripheral Scandinavian North Germanic languages, such as Faroese (Faroe Islands), which is potentially endangered, and Norn (Shetland and Orkney), which has unfortunately been dead for over a hundred years.
EN
Nowadays many small languages are seriously threatened with extinction in the coming years. This situation is undesirable from the point of view of preserving the diversity of cultural heritage. Moreover, every human language constitutes precious research material for linguists, on the basis of which they can test their scientific hypotheses concerning the nature of human language, its origin, and the evolution of groups and families of languages. The scientific community has developed many initiatives and projects that aim to preserve small languages from dying out completely. In the paper we propose building suitable computer tools in the form of generators of syntactic structures that could support the process of reconstructing and revitalizing endangered and even dead languages. The authors' interest lies in the peripheral Scandinavian languages, such as the potentially endangered language of the Faroe Islands and the Norn language, which has unfortunately been dead for over one hundred years.
For several years now, computational linguistics has been addressing the problems of automatic translation and developing technological tools for it, given its important economic implications. At the same time, projects dedicated to facilitating translations of ancient works, which are often fraught with considerable hermeneutical difficulties, are far rarer. The PTTB system, which was designed and constructed at the Institute for Computational Linguistics (National Research Council) in Pisa, enables a group of about fifty scholars to translate the entire Babylonian Talmud, written in Aramaic and Biblical Hebrew, more quickly and uniformly. While the language and structure of the textual corpus made the development of machine translation algorithms impossible, translation memory and edit distance techniques have produced excellent results. Based on them, the system offers scholars a high percentage of correct translations, accessible through a very intuitive graphical user interface. The results are easily exportable to XML files suitable for the final editing and printing operations. So far, these innovations have made it possible to publish four treatises in six printed volumes with translations, annotations and thematic indexes within a relatively short time. Several other volumes have already been processed and are currently being edited. Various perspectives open up for the use of the digital Talmud in Italian. One of the most interesting options involves using machine learning and named entity recognition techniques to associate semantic or conceptual values (Talmud Ontological Framework) with portions of the text that report or discuss similar themes, and to create cross-references among them. This will help various groups of (general and specialised) users to browse this vast and heterogeneous textual archive on a semantic basis.
The strategy adopted here is also aligned with the Dictionnaire des Termes Médico-botaniques de l’Ancien Occitan (DiTMAO), another ongoing lexicographical project. It will enable users to semantically navigate within an extensive medical-pharmaceutical and botanical textual corpus in medieval Occitan. For these reasons, PTTB and DiTMAO can be regarded as two instances of one innovative technological infrastructure for linguistic and philological research in the field of digital humanities.
IT
For many years, computational linguistics has been addressing, through many public and private bodies, the problems posed by machine translation, which has important practical and economic repercussions. Far rarer, by contrast, are cases in which scientific communities and/or industrial players at the international level invest resources to make the work of those who tackle ancient texts, often marked by great interpretative difficulties, simpler and faster. The TRADUCO system, designed and built at the Istituto di Linguistica Computazionale of the CNR in Pisa, enables a group of Talmud scholars to render the treatises of the Babylonian "Talmud", written in Aramaic and Biblical Hebrew, into current Italian. The language and structure of the text made designing "Machine Translation" algorithms impracticable, whereas excellent results were obtained with "Translation Memory" and "Edit Distance" techniques. Well harmonized with each other, these allow the system to offer specialists an ever higher percentage of correct translations within an intuitive working environment. The result can be exported to XML files prepared for the final printing operations. This has already made it possible to publish 5 treatises in printed volumes that offer translated text, annotations, thematic indexes, and other information. Many volumes have already been translated and are currently undergoing editorial review. Finally, various prospects are opening up for the use of the digital "Talmud" in Italian. Among them, one of the most interesting concerns the possibility of associating, also by means of "Machine Learning" and "Named Entity Recognition" techniques, semantic or conceptual values ("Talmud Ontological Framework") with portions of text that report or discuss similar themes. This will make it possible to navigate such a vast and heterogeneous textual archive on a semantic basis.
The strategy adopted can also be adapted to other lexicographical projects, such as the DiTMAO ("Dictionnaire des Termes Médico-botaniques de l’Ancien Occitan"). It will offer semantically oriented navigation paths within a vast corpus of digitized texts on medical-pharmaceutical and botanical subjects in medieval Occitan. For these reasons, TRADUCO and DiTMAO emerge as instances of a technological infrastructure for computational linguistics and philology that is among the most innovative in the field of "Digital Humanities".