Search results
Searched in keywords: analiza tekstu (text analysis)
Results found: 28 (page 1 of 2)
1
EN
Significant technological advances have determined the importance of FinTech firms worldwide; they attract substantial investment and put competitive pressure on banks providing traditional services. The development of financial innovations challenges users accustomed to classical financial solutions, since trust in financial technologies requires risk assessment, which becomes increasingly complicated. The main participants shaping the attitude towards FinTech are investors, customers, regulators, technology developers, and risk managers. The paper aims to explore FinTech opportunities and challenges as the public understands them. The authors used scientific sources and employed big data processing methods to evaluate social media users' attitudes towards the FinTech sector. The obtained results revealed that, despite an overall positive attitude, FinTech companies have to pay special attention to investment management and to ensuring the security and privacy of clients' data.
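The paper does not publish its processing pipeline; as a loose illustration of how social media posts can be scored for attitude, here is a minimal lexicon-based sentiment sketch using NLTK's VADER. The tool choice and the example posts are our assumptions, not the authors' method.

```python
# Minimal sentiment-scoring sketch for social media posts about FinTech.
# NLTK's VADER is our stand-in; the paper does not name its tools.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

posts = [
    "Opened an account in 5 minutes, great app!",           # hypothetical posts
    "Worried about how this FinTech handles my data...",
]
for post in posts:
    scores = analyzer.polarity_scores(post)  # neg/neu/pos plus compound in [-1, 1]
    label = ("positive" if scores["compound"] >= 0.05
             else "negative" if scores["compound"] <= -0.05 else "neutral")
    print(f"{label:8s} {scores['compound']:+.2f}  {post}")
```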
2
EN
Fault management is an expensive process, and analyzing data manually requires considerable resources. Modern software bug tracking systems may be equipped with automated bug report assignment functionality that facilitates bug classification or the assignment of bugs to the proper development group. For decision-support systems, it would be beneficial to introduce information related to explainability. The purpose of this work is to evaluate the use of explainable artificial intelligence (XAI) in processes related to software development and bug classification, based on bug reports created by either software testers or software users. The research was conducted on two different datasets. The first concerns the classification of security vs. non-security bug reports and comes from a telecommunication company that develops software and hardware solutions for mobile operators. The second dataset contains a list of software bugs taken from an open-source project; here the task is to classify issues with one of the following labels: crash, memory, performance, or security. The studies show that the XAI-related algorithms yield results with no major differences from the other algorithms they were compared with. Users can therefore obtain results with accompanying explanations, and experts can verify a model or its parts before putting it into production, without any degradation of accuracy. The studies showed that the approach could be put into practice, although this has not been done so far.
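The paper does not include code; as a hedged sketch, explanations of the kind described (which tokens drove a security/non-security prediction) can be produced with LIME on top of any probabilistic text classifier. The training snippets and labels below are placeholders, since the paper's datasets are not public.

```python
# Sketch: explaining a security/non-security bug report classifier with LIME.
# Training data here is a placeholder for the paper's proprietary datasets.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

reports = ["buffer overflow in login handler", "button label misaligned on resize",
           "TLS certificate check skipped", "typo in settings menu"]
labels = [1, 0, 1, 0]  # 1 = security, 0 = non-security

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reports, labels)

explainer = LimeTextExplainer(class_names=["non-security", "security"])
exp = explainer.explain_instance(
    "overflow when parsing certificate", clf.predict_proba, num_features=4)
print(exp.as_list())  # (token, weight) pairs behind the prediction
```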
3
Using Word Embeddings for Italian Crime News Categorization
EN
Several studies have shown that the use of embeddings improves outcomes in many NLP activities, including text categorization. In this paper, we focus on how word embeddings can be used on newspaper articles about crimes to categorize them according to the type of crime they report. Our approach was tested on an Italian dataset of 15,361 crime news articles combining different Word2Vec models and exploiting supervised and unsupervised Machine Learning categorization algorithms. The tests show very promising results.
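The general recipe the paper combines (Word2Vec vectors feeding a supervised classifier) can be sketched as below. The toy corpus and labels are ours, not the 15,361-article dataset, and averaged word vectors are just one of several ways to build document features.

```python
# Sketch: averaged Word2Vec features + a supervised classifier for news categorization.
# Toy corpus and labels are placeholders for the Italian crime news dataset.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

docs = [["rapina", "banca", "arrestato"], ["furto", "auto", "denunciato"],
        ["omicidio", "indagini", "procura"], ["truffa", "online", "anziani"]]
labels = ["robbery", "theft", "murder", "fraud"]

w2v = Word2Vec(docs, vector_size=50, window=3, min_count=1, epochs=50, seed=0)

def doc_vector(tokens):
    # Average the vectors of the tokens present in the vocabulary.
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X = np.stack([doc_vector(d) for d in docs])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict([doc_vector(["rapina", "gioielleria"])]))
```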
4
EN
Here we present a novel approach for the automated creation of parallel New Testament corpora with cross-lingual semantic concordance based on Strong's numbers. There is a lack of available digital Biblical resources for scholars. We present two approaches to tackle the problem, a dictionary-based approach and a CRF model, together with a detailed evaluation on annotated and non-annotated translations. We discuss a proof of concept based on English and German New Testament translations. The results presented in this paper are novel and, to our knowledge, unique. They show promising performance, although further research is necessary.
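Of the two approaches, the dictionary-based one reduces to looking each token up in a lemma-to-Strong's-number mapping; tokens in different languages that share a number are then aligned. The miniature dictionary below is purely illustrative, not the authors' resource.

```python
# Sketch of the dictionary-based approach: map tokens to Strong's numbers.
# The mini-dictionary is illustrative; real resources cover the full lexicon.
STRONGS = {
    "love": "G26",      # agape
    "god": "G2316",     # theos
    "liebe": "G26",     # German token mapping to the same concept
    "gott": "G2316",
}

def annotate(tokens):
    """Attach a Strong's number (or None) to every token."""
    return [(t, STRONGS.get(t.lower())) for t in tokens]

print(annotate("God is love".split()))
print(annotate("Gott ist Liebe".split()))
# Tokens sharing a Strong's number correspond cross-lingually in the concordance.
```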
5
EN
The relationship between a drug and its side effects is outlined on two websites: Sider and WebMD. The aim of this study was to find the association between drugs and their side effects. We compared the reports of typical users of the “Ask a Patient” website with the drug side effects reported on reference sites such as Sider and WebMD. In addition, the users' comments on highly commented drugs (neurotic drugs, anti-pregnancy drugs, and gastrointestinal drugs) were analyzed using a deep learning method. To this end, users' comments on drugs' side effects over the last decades were collected from the “Ask a Patient” website. The data on drugs were then classified by side effect using a deep learning model, the hierarchical attention network (HAN), and the main topics of side effects for each group of drugs were identified and reported via the Sider and WebMD websites. Our model demonstrates its ability to accurately describe and label side effects in a temporal text corpus; the deep learning classifier is shown to be an effective method to precisely discover the association between drugs and their side effects. Moreover, the model can immediately locate information on reference sites to recognize the side effects of new drugs, which is applicable for drug companies. This study suggests that the sensitivity of internet users, together with diverse scientific findings, benefits the distinct detection of adverse drug effects, and that deep learning facilitates it.
6
Optimization of Retrieval Algorithms on Large Scale Knowledge Graphs
EN
Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the field of life sciences or bioinformatics. Although a lot of research has been done on query optimization, query transformation and, of course, on storing and retrieving large scale knowledge graphs, algorithmic optimization is still a major challenge and a vital factor in using graph databases. Few researchers have addressed the problem of optimizing algorithms on large scale labeled property graphs. Here, we present two optimization approaches and compare them with a naive approach of directly querying the graph database. The aim of our work is to determine the limiting factors of graph databases like Neo4j, and we describe a novel solution to tackle these challenges. For this, we suggest a classification schema to differentiate between the complexities of problems on a graph database. We evaluate our optimization approaches on a test system containing a knowledge graph derived from biomedical publication data enriched with text mining data. This dense graph has more than 71M nodes and 850M relationships. The results are very encouraging and, depending on the problem, we were able to show a speedup of a factor between 44 and 3839.
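The paper names Neo4j as the system under test; a hedged sketch of the "naive" baseline (querying the graph directly with Cypher from Python) might look like this. The URI, credentials and the Publication/MENTIONS schema are placeholders, not the authors' actual graph model.

```python
# Sketch of the naive baseline: query the knowledge graph directly with Cypher.
# URI, credentials and the Publication/MENTIONS schema are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

query = """
MATCH (p:Publication)-[:MENTIONS]->(e:Entity {name: $name})
RETURN p.title AS title
LIMIT 25
"""

with driver.session() as session:
    for record in session.run(query, name="TP53"):
        print(record["title"])

driver.close()
```

On a graph with 850M relationships, such direct traversals are exactly where the paper's optimization approaches aim to improve on the baseline.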
7
Open IE-Triples Inference - Corpora Development and DNN Architectures
EN
Natural language inference (NLI) is a well-established part of natural language understanding (NLU). This task is usually stated as a 3-way classification of sentence pairs with respect to the entailment relation (entailment, neutral, contradiction). In this work, we focus on a derived task of relation inference: we propose a method of transforming a general NLI corpus into an annotated corpus for relation inference that utilizes the existing NLI annotations. We subsequently introduce a novel relation inference corpus obtained from the well-known SNLI corpus and provide its brief characterization. We investigate several siamese DNN architectures for this task and the corresponding corpus. We set several baselines, including a hypothesis-only baseline. Our best architecture achieved 96.92% accuracy.
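The paper's concrete architectures are not reproduced here; as a minimal sketch, a siamese setup for 3-way inference typically encodes both inputs with shared weights and classifies standard matching features. All sizes below are arbitrary placeholders.

```python
# Sketch of a siamese DNN for 3-way inference (entailment/neutral/contradiction).
# Vocabulary size, sequence length and layer sizes are arbitrary placeholders.
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB, MAXLEN, DIM = 20000, 64, 128

encoder = tf.keras.Sequential([          # shared weights for both inputs
    layers.Embedding(VOCAB, DIM),
    layers.Bidirectional(layers.LSTM(64)),
])

left = layers.Input(shape=(MAXLEN,), dtype="int32")
right = layers.Input(shape=(MAXLEN,), dtype="int32")
u, v = encoder(left), encoder(right)

# Classic matching features: [u; v; |u - v|; u * v]
diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([u, v])
prod = layers.multiply([u, v])
merged = layers.concatenate([u, v, diff, prod])

hidden = layers.Dense(128, activation="relu")(merged)
out = layers.Dense(3, activation="softmax")(hidden)
model = Model([left, right], out)
model.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```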
8
Named Entity Recognition and Named Entity Linking on Esports Contents
EN
We built a named entity recognition/linking system for Esports news. We established an ontology for Esports-related entities; collected and annotated a corpus from 80 articles on 4 different Esports titles; trained CRF- and BERT-based entity recognizers; built a basic DOTA2 knowledge base and an entity linker that links mentions to articles in Liquipedia; and assembled an end-to-end web app that serves as a demo of this entire proof-of-concept system. Our system achieved an overall entity-level F1-score of over 61% on the test set for the NER task.
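The authors' Esports-specific recognizer is not reproduced here; as a hedged sketch, the BERT-based recognition step can be approximated with the Hugging Face pipeline API and an off-the-shelf general-purpose NER model, which will of course miss domain entities the paper's trained model catches.

```python
# Sketch: BERT-based NER with the Hugging Face pipeline API.
# "dslim/bert-base-NER" is a general-purpose public model, not the
# paper's Esports-specific recognizer, so domain entities may be missed.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

text = "Team Liquid beat OG in the DOTA2 grand final at TI9."
for ent in ner(text):
    print(f'{ent["word"]:15s} {ent["entity_group"]:5s} {ent["score"]:.2f}')
```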
9
EN
This paper describes a study on opinion analysis applied both to human-to-chatbot and to human-to-human conversations, using data from the banking sector. A polarity classifier based on an SVM model, applied to conversations, provides insights into, and visualisations of, user satisfaction at a given time and its evolution. We conducted a study on the evolution of opinion in conversations started with the chatbot and then transferred to a human agent. This work illustrates how opinion analysis techniques can be applied to improve the customers' user experience, but also to detect topics that generate frustration with a chatbot or with human experts.
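A polarity classifier of the kind described (an SVM over conversation turns) can be sketched with scikit-learn; the toy banking utterances and labels below are ours, not the paper's data.

```python
# Sketch: SVM polarity classifier over conversation turns (toy banking data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

turns = ["thanks, that solved my card issue",
         "this bot keeps misunderstanding me",
         "great, transfer confirmed",
         "I have been waiting forever for an agent"]
polarity = ["pos", "neg", "pos", "neg"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(turns, polarity)

# Score each turn of a new conversation to track satisfaction over time.
conversation = ["hello, my card was blocked", "that worked, thank you so much"]
print(list(zip(conversation, clf.predict(conversation))))
```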
10
EN
Knowledge graphs play a central role in big data integration, especially for connecting data from different domains. Bringing unstructured texts, e.g. from scientific literature, into a structured, comparable format is one of the key assets. Here, we use knowledge graphs in the biomedical domain, working together with text-mining-based document data, for knowledge extraction and retrieval from text and natural language structures. Cause-and-effect models, for example, can potentially facilitate clinical decision making or help to drive research towards precision medicine. However, the power of knowledge graphs critically depends on context information. Here we provide a novel semantic approach towards a context-enriched biomedical knowledge graph, utilizing data integration with linked data applied to language technologies and text mining. This graph concept can be used for graph embedding applied in different approaches, e.g. with a focus on topic detection, document clustering and knowledge discovery. We discuss algorithmic approaches to tackle these challenges and show results for several applications such as search query finding and knowledge discovery. The presented approaches lead to valuable results on large knowledge graphs.
11
Explorations into Deep Learning Text Architectures for Dense Image Captioning
EN
Image captioning is the process of generating a textual description that best fits the image scene. It is one of the most important tasks in computer vision and natural language processing and has the potential to improve many applications in robotics, assistive technologies, storytelling, medical imaging and more. This paper aims to analyse different encoder-decoder architectures for dense image caption generation, focusing on the text generation component. Models already trained for image feature generation are utilized via transfer learning. These features are used for describing the regions using three different models for text generation. We propose three deep learning architectures for generating one-sentence captions of Regions of Interest (RoIs). The proposed architectures reflect several ways of integrating features from images and text. The proposed models were evaluated and compared using several metrics for natural language generation.
12
EN
We propose methods for the automatic generation of corpora that contain descriptions of diagnoses in Bulgarian and their associated codes in ICD-10-CM (International Classification of Diseases, 10th revision, Clinical Modification). The proposed approach is based on available open data and Linked Open Data and can be easily adapted to other languages. The resulting corpus generated for Bulgarian clinical texts consists of about 370,000 pairs of diagnoses and corresponding ICD-10 codes, which is beyond the usual size that can be generated manually; moreover, it was created from scratch in a relatively short time. Further updates of the corpus are also possible whenever new open resources become available or the current ones are updated.
13
EN
This study applies citation analysis and automated topic analysis to papers published at the International Conference on Agile Software Development (XP) from 2002 to 2018. We collected data from the Scopus database, finding 789 XP papers. We performed topic and trend analysis with R/RStudio utilizing a text mining approach, and used MS Excel for the quantitative analysis of the data. The results show that the first five years of the XP conference cover nearly 40% of the papers published until now, and almost 62% of the XP papers are cited at least once. Mining XP conference paper titles and abstracts yields these hot research topics: "Coordination", "Technical Debt", "Teamwork", "Startups" and "Agile Practices", showing a strong focus on practical issues. The results also highlight the most influential researchers and institutions. The approach applied in this study can be extended to other software engineering venues and applied to large-scale studies.
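The authors worked in R/RStudio; a comparable topic analysis over titles and abstracts, sketched here in Python with scikit-learn's LDA as our stand-in, could look like this. The toy abstracts are placeholders for the 789 Scopus records.

```python
# Sketch: topic analysis of paper titles/abstracts with LDA.
# The authors used R/RStudio; this Python/scikit-learn version is our stand-in.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

abstracts = [
    "pair programming and teamwork in agile teams",
    "managing technical debt in iterative development",
    "coordination practices in large scale agile",
    "lean startup experiments and agile practices",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-4:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```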
14
Medical prescription classification: a NLP-based approach
EN
The digitization of healthcare data has been consolidated in the last decade as a must for managing the vast amount of data generated by healthcare organizations. Carrying out this process effectively represents an enabling resource that will improve healthcare service provision, as well as on-the-edge related applications, ranging from clinical text mining to predictive modelling, survival analysis, patient similarity, genetic data analysis and many others. The application presented in this work concerns the digitization of medical prescriptions, issued either to provide authorization for healthcare services or to grant reimbursement for medical expenses. The proposed system first extracts text from the scanned medical prescription; then Natural Language Processing and machine learning techniques provide effective classification, exploiting embedded terms and categories about patient/doctor personal data, symptoms, pathology, diagnosis and suggested treatments. A RESTful web service is introduced, together with the results of prescription classification over a set of 800K+ diagnostic statements.
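The two-stage pipeline the paper describes (text extraction from scans, then NLP-based classification) can be sketched as below. pytesseract stands in for whatever OCR engine the authors used, and the training snippets and labels are invented placeholders.

```python
# Sketch of the two-stage pipeline: OCR on the scanned prescription,
# then text classification. pytesseract and the toy training data are
# our stand-ins; the paper does not publish its components.
import pytesseract
from PIL import Image
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stage 1: extract text from the scanned prescription image.
text = pytesseract.image_to_string(Image.open("prescription.png"), lang="ita")

# Stage 2: classify the extracted diagnostic statement.
train_texts = ["visita cardiologica con ecg", "risonanza magnetica ginocchio",
               "esame emocromocitometrico completo"]
train_labels = ["cardiology", "radiology", "laboratory"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(train_texts, train_labels)
print(clf.predict([text]))
```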
15
Knowledge extraction and applications utilizing context data in knowledge graphs
EN
Context is widely considered for NLP and knowledge discovery since it highly influences the exact meaning of natural language. The scientific challenge is not only to extract such context data, but also to store this data for further NLP approaches. Here, we propose a multiple step knowledge graph based approach to utilize context data for NLP and knowledge expression and extraction. We introduce the graph-theoretic foundation for a general context concept within semantic networks and show a proof-of-concept based on biomedical literature and text mining. We discuss the impact of this novel approach on text analysis, various forms of text recognition and knowledge extraction and retrieval.
16
EN
The variety of hardware devices and the diversity of their users impose new requirements and expectations on designers and developers of mobile applications (apps). While the Internet has enabled new forms of communication platforms, online stores provide the ability to review apps. These informal online app reviews have become a viral form of electronic word-of-mouth (eWOM), covering a plethora of issues. In our study, we set ourselves the goal of investigating whether online reviews reveal usability and user experience (UUX) issues, these being important quality-in-use characteristics. To address this problem, we used sentiment analysis techniques with the aim of extracting relevant keywords from eWOM WhatsApp data. Based on the extracted keywords, we then identified the original users' reviews and individually assigned each attribute and dimension to them. Eventually, the reported issues were thematically synthesized into 7 attributes and 8 dimensions. If one asks whether online reviews reveal genuine UUX issues, in this case the answer is definitely affirmative.
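The keyword-extraction step can be sketched with a simple TF-IDF ranking over reviews; the toy reviews below are ours, not the WhatsApp eWOM dataset, and real pipelines would first split reviews by sentiment.

```python
# Sketch: surface candidate UUX keywords from app reviews via TF-IDF.
# Toy reviews stand in for the WhatsApp eWOM dataset used in the study.
from sklearn.feature_extraction.text import TfidfVectorizer

negative_reviews = [
    "the new interface is confusing and hard to navigate",
    "notifications are unreliable and settings are hidden",
    "app crashes when sending photos, very frustrating",
]
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(negative_reviews)

# Rank terms by their summed TF-IDF weight across all negative reviews.
weights = X.sum(axis=0).A1
terms = vec.get_feature_names_out()
top = sorted(zip(terms, weights), key=lambda p: -p[1])[:8]
print(top)  # candidate keywords to map onto UUX attributes/dimensions
```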
17
Deriving workflow privacy patterns from legal documents
EN
The recent General Data Protection Regulation (GDPR) has strengthened the importance of data privacy and protection for enterprises offering their services in the EU. An important part of the intensified efforts towards better privacy protection is enterprise workflow redesign. It has already been found that the privacy level can be raised by applying the privacy-by-design principle when (re)designing workflows. A conforming and promising approach is to model privacy-relevant workflow fragments as Workflow Privacy Patterns (WPPs), which provide abstract, 'best practice' solution proposals to problems recurring in privacy-aware workflows. WPPs are intended to support process developers, auditors and privacy officers by providing pre-validated patterns that correspond with existing data privacy regulations. However, it is still unclear how to obtain WPPs with an appropriate level of detail. In this paper, we introduce our approach to deriving WPPs from legal texts and other descriptive regulations. We propose a structure for a WPP, which we derive from pattern approaches in other research areas. We also show the steps for designing a WPP. We think that this approach can be a valuable input towards supporting privacy in enterprises.
18
EN
Corporate reputation is an economic asset and its accurate measurement is of increasing interest in practice and science. This measurement task is difficult because reputation depends on numerous factors and stakeholders. Traditional measurement approaches have focused on human ratings and surveys, which are costly, can be conducted only infrequently and emphasize financial aspects of a corporation. Nowadays, online media with comments related to products, services, and corporations provides an abundant source for measuring reputation more comprehensively. Against this backdrop, we propose an information retrieval approach to automatically collect reputation-related text content from online media and analyze this content by machine learning-based sentiment analysis. We contribute an ontology for identifying corporations and a unique dataset of online media texts labelled by corporations' reputation. Our approach achieves an overall accuracy of 84.4%. Our results help corporations to quickly identify their reputation from online media at low cost.
19
EN
The article presents the results of a study of methods that help build a database of end-user parameters using other websites as the primary source of information. The aim of the study was to obtain data about users of a social network, to verify the capabilities of the parser, and to analyze the effectiveness of using the created database. Two text analysis methods were used in the research: parsing and scraping. The results are presented graphically and subjected to a critical comparative analysis.
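The two techniques named, scraping and parsing, can be sketched with requests and BeautifulSoup; the URL and CSS selectors below are placeholders, since the article does not name the social network or its page structure.

```python
# Sketch: scraping a profile page and parsing out user attributes.
# URL and selectors are placeholders; the article does not name the site.
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/user/12345", timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
record = {
    "name": soup.select_one(".profile-name").get_text(strip=True),
    "city": soup.select_one(".profile-city").get_text(strip=True),
}
print(record)  # one row for the end-user parameters database
```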
20
Analysis of methods and means of text mining
EN
In the Big Data era, when data volume doubles every year, analyzing all of this data becomes a really complicated task, so text mining systems, techniques and tools become the main instruments for analyzing tons and tons of information, selecting the information that best suits your needs, and simply saving your time for more interesting things. The main aims of this article are to explain the basic principles of this field and to give an overview of some interesting technologies that are nowadays widely used in text mining.
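Among the basic principles such overviews cover, most text mining pipelines start from tokenization, stop-word removal and term counting; a minimal illustration (with a tiny stop-word list of our own):

```python
# Minimal illustration of the most basic text mining steps:
# tokenize, drop stop words, count terms.
import re
from collections import Counter

STOP = {"the", "of", "and", "a", "in", "to", "is", "into"}

def tokenize(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP]

doc = "Text mining turns large volumes of text into structured, countable data."
print(Counter(tokenize(doc)).most_common(5))
```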