We propose a new metaphor of two-dimensional text for data-driven semantic modeling of natural language. It offers a new angle on the representation of text: in addition to annotating syntagmatic relations in the text, paradigmatic relations are made explicit by generating lexical expansions. We operationalize distributional similarity in a general framework for large corpora, and describe a new method to generate similar terms in context. Our evaluation shows that distributional similarity is able to produce high-quality lexical resources in an unsupervised and knowledge-free way, and that our highly scalable similarity measure yields better scores in a WordNet-based evaluation than previous measures for very large corpora. Evaluating on a lexical substitution task, we find that our contextualization method improves over a non-contextualized baseline across all parts of speech, and we show how the metaphor can be applied successfully to part-of-speech tagging. A number of ways to extend and improve the contextualization method within our framework are discussed. Unlike comparable approaches, our framework defines a model of lexical expansions in context that can generate the expansions rather than rank a given list, and thus does not require existing lexical-semantic resources.
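To make the idea of distributional similarity concrete, the following is a minimal sketch that ranks terms by how many context features they share. It is not the similarity measure evaluated in the paper: the feature definition (immediate left and right neighbors), the raw overlap count, the toy corpus, and the names `context_features` and `similar_terms` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def context_features(sentences):
    """Map each term to the set of (position, neighbor) features it occurs with.

    Assumption: a context feature is simply the word directly to the
    left or right of the term.
    """
    features = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, term in enumerate(tokens):
            if i > 0:
                features[term].add(("left", tokens[i - 1]))
            if i < len(tokens) - 1:
                features[term].add(("right", tokens[i + 1]))
    return features


def similar_terms(term, features, top_n=3):
    """Rank other terms by the number of context features shared with `term`."""
    target = features.get(term, set())
    scores = {
        other: len(target & feats)
        for other, feats in features.items()
        if other != term
    }
    # Sort by descending overlap, then alphabetically to break ties.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(t, s) for t, s in ranked[:top_n] if s > 0]


# Toy corpus, invented for this example.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
]
features = context_features(corpus)
expansions = similar_terms("cat", features)
```

In this toy corpus, "dog" occurs in exactly the same contexts as "cat" and is therefore ranked first as a lexical expansion. A method of the kind the abstract describes would additionally restrict such expansions to those that fit the given sentence.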
We describe a software system that processes textual data and spoken input streams of natural language and arranges the information in a meaningful way on the screen: concepts as nodes, relations as edges. For spoken input, the software simulates conceptual awareness. A naturally spoken speech stream is converted into a word stream (speech-to-text); the most significant concepts are extracted and associated with related concepts that the speaker(s) have not yet mentioned. The result is displayed on a screen as a conceptual structure.
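The processing steps after speech-to-text can be sketched roughly as follows. This is an illustration under stated assumptions, not the system's implementation: concept significance is approximated by raw frequency after stopword removal, and `ASSOCIATIONS` is a hypothetical hand-written table standing in for whatever association resource the system actually uses.

```python
from collections import Counter

# Assumption: a tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}

# Hypothetical association table; a real system would presumably derive
# related concepts from corpus statistics or another resource.
ASSOCIATIONS = {
    "engine": ["fuel", "car"],
    "car": ["road", "engine"],
}


def build_concept_graph(word_stream, top_n=2):
    """Return (nodes, edges) linking frequent concepts to unmentioned related ones."""
    counts = Counter(w for w in word_stream if w not in STOPWORDS)
    concepts = [w for w, _ in counts.most_common(top_n)]
    mentioned = set(word_stream)
    edges = []
    for concept in concepts:
        for related in ASSOCIATIONS.get(concept, []):
            # Only suggest concepts the speaker has not mentioned yet.
            if related not in mentioned:
                edges.append((concept, related))
    nodes = sorted(set(concepts) | {related for _, related in edges})
    return nodes, edges


# Word stream as it might arrive from a speech-to-text component.
words = "the car engine is loud and the car engine is old".split()
nodes, edges = build_concept_graph(words)
```

Here "car" and "engine" are selected as the most significant concepts, and only "road" and "fuel" are attached as related nodes, because the other associated concepts already occur in the word stream.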