Results found: 7

Search results
Searched for:
in keywords: data streams
EN
In recent years, many deep learning methods have significantly improved systems based on artificial intelligence. Their effectiveness results from the ability to analyze large labeled datasets. The price of such high accuracy is the long training time needed to process such large amounts of data. On the other hand, as the amount of collected data has grown, the field of data stream analysis has developed. It makes it possible to process data immediately, with no need to store them. In this work, we take advantage of data streaming to accelerate the training of deep neural networks. The work includes an analysis of two approaches to network learning, presented against the background of traditional stochastic and batch-based methods.
2
Exploring complex and big data [content available]
EN
This paper shows how big data analysis opens a range of research and technological problems and calls for new approaches. We start by defining the essential properties of big data and discussing the main types of data involved. We then survey the dedicated solutions for storing and processing big data, including a data lake, virtual integration, and a polystore architecture. Difficulties in managing data quality and provenance are also highlighted. The characteristics of big data also imply specific requirements and challenges for data mining algorithms, which we address as well. The links with related areas, including data streams and deep learning, are discussed. The common theme that naturally emerges from this characterization is complexity. All in all, we consider it to be the truly defining feature of big data (posing particular research and technological challenges), which ultimately seems to be of greater importance than the sheer data volume.
EN
Consumer brands often offer discounts to attract new shoppers to buy their products. The most valuable customers are those who return after this initial incentive purchase. With enough purchase history, it is possible to predict which shoppers, when presented with an offer, will buy a new item. When dealing with Big Data, and with data streams in particular, it is common practice to summarize or aggregate customers' transaction history over periods of a few months. As a result, we compress the given huge volume of data and transform the data stream into the standard rectangular format. Consequently, we can explore a variety of practically or theoretically motivated tasks. For example, we can rank the given field of customers according to their loyalty or intention to repurchase in the near future. This objective has a very important practical application: it leads to preferential treatment of the right customers. We tested our model (with competitive results) online during the Kaggle-based Acquire Valued Shoppers Challenge in 2014.
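The aggregation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the event layout `(customer_id, date, amount)`, the function names, and the default of four three-month periods are all assumptions made for illustration. It folds a stream of transactions into one fixed-width row per customer, which is the "rectangular format" the abstract refers to.

```python
from collections import defaultdict
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def aggregate_stream(transactions, reference, period_months=3, n_periods=4):
    """Fold (customer_id, date, amount) events into fixed-width rows.

    Column k holds the customer's total spend in the k-th period before
    `reference`; column 0 is the most recent period. All names and the
    event layout are hypothetical.
    """
    table = defaultdict(lambda: [0.0] * n_periods)
    for customer_id, day, amount in transactions:
        age = months_between(day, reference)
        if age < 0:
            continue  # event lies after the reference date
        bucket = age // period_months
        if bucket < n_periods:
            table[customer_id][bucket] += amount
    return dict(table)


events = [
    ("c1", date(2014, 5, 10), 20.0),
    ("c1", date(2014, 1, 3), 5.0),
    ("c2", date(2013, 9, 21), 12.5),
    ("c1", date(2014, 6, 1), 7.5),
]
rows = aggregate_stream(events, reference=date(2014, 6, 30))
```

Each event is visited once and only the per-customer totals are kept, so the raw stream never has to be stored. The resulting rows can then be passed to any standard tabular ranking or classification model.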
EN
The recently deployed supercomputer Tryton, located in the Academic Computer Center of Gdansk University of Technology, provides great means for massively parallel processing. Moreover, the status of the Center as one of the main nodes in the PIONIER network enables fast and reliable transfer of data produced by miscellaneous devices scattered across the whole country. Typical examples of such data are streams containing radio-telescope and satellite observations. Their analysis, especially under real-time constraints, can be challenging and requires dedicated software components. We propose a solution for such parallel analysis using the supercomputer, supervised by the KASKADA platform, which, in conjunction with immersive 3D visualization techniques, can be used to solve problems such as pulsar detection and chronometry or oil-spill simulation on the sea surface.
5
Incremental rule-based learners for handling concept drift: an overview [content available remotely]
EN
Learning from non-stationary environments is a very popular research topic. Algorithms that deal with the concept drift problem already exist. Among them are online or incremental learners, which process data instance by instance. Their knowledge representation can take different forms, such as decision rules, which have not received enough attention in learning with concept drift. This paper reviews incremental rule-based learners designed for changing environments. It describes four of the proposed algorithms: FLORA, AQ11-PM+WAH, FACIL and VFDR. These four solutions can be compared on several criteria, such as the type of processed data, adjustment to changes, type of maintained memory, knowledge representation, and others.
6
Data warehouse for event streams violating rules [content available remotely]
EN
In this presentation, we discuss how a data warehouse can support situational awareness and data forensics needs in the investigation of event streams violating rules. The data warehouse for event streams can contain summary tables showing rule violations at different aggregation levels. We introduce a classification of rules and the concept of a general aggregation graph for defining various classes of rule violations and their relationships. A data warehouse system containing various rule violation aggregations will allow data forensics experts to "drill down" into event data across different data warehouse dimensions. Real-time event stream processing and other software modules can also use the summarizations to discover whether current event bursts satisfy rules by comparing them with historic event bursts.
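The idea of summary tables at several aggregation levels can be sketched as follows. This is an illustrative assumption, not the system presented in the abstract: the event fields (`sensor`, `hour`, `value`), the threshold rule, and the level names are all hypothetical. Each level is defined by the dimensions it groups on, so a coarser level is a roll-up of a finer one.

```python
from collections import Counter


def summarize_violations(events, rule, levels):
    """Count rule violations at several aggregation levels in one pass.

    `rule(event)` returns True when the event satisfies the rule.
    `levels` maps a level name to the tuple of dimensions it groups on;
    an empty tuple gives the grand total. All names are hypothetical.
    """
    summaries = {name: Counter() for name in levels}
    for event in events:
        if rule(event):
            continue  # the event satisfies the rule, nothing to record
        for name, dims in levels.items():
            key = tuple(event[d] for d in dims)
            summaries[name][key] += 1
    return summaries


events = [
    {"sensor": "s1", "hour": 9, "value": 120},
    {"sensor": "s1", "hour": 9, "value": 80},
    {"sensor": "s1", "hour": 10, "value": 150},
    {"sensor": "s2", "hour": 9, "value": 101},
]
levels = {
    "sensor_hour": ("sensor", "hour"),  # finest level
    "sensor": ("sensor",),              # roll-up over hours
    "all": (),                          # grand total
}
summaries = summarize_violations(events, lambda e: e["value"] <= 100, levels)
```

Drilling down then amounts to moving from a key in a coarse table, such as `("s1",)`, to the keys in a finer table that share its prefix, such as `("s1", 9)` and `("s1", 10)`.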
7
Content-based load shedding in multimedia data stream management system [content available remotely]
EN
Overload management has become very important in public safety systems that analyse high-performance multimedia data streams, especially in the detection of terrorist and criminal dangers. Efficient overload management improves the accuracy of automatic identification of persons suspected of terrorist or criminal activity without requiring interaction with them. We argue that in order to improve the quality of multimedia data stream processing in the public safety arena, the innovative concept of a Multimedia Data Stream Management System (MMDSMS) using load-shedding techniques should be introduced into the infrastructure to monitor and optimize the execution of multimedia data stream queries. In this paper, we present a novel content-centered load shedding framework, based on searching and matching algorithms, for analysing video tuples arriving within multimedia data streams. The framework tracks and registers all symptoms of overload, and either prevents overload before it occurs or minimizes its effects. We have extended our Continuous Query Language (CQL) syntax to enable this load shedding technique. The effectiveness of the framework has been verified using both artificial and real video data streams collected from monitoring devices.
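As a rough illustration of the general idea of content-based load shedding, the sketch below keeps a bounded buffer and, once it is full, drops whichever tuple has the lowest content score. It does not reproduce the framework from the abstract, whose searching and matching algorithms and CQL extensions are not described here; the class name, the `score` argument, and the fixed-capacity policy are assumptions.

```python
import heapq
from itertools import count


class ContentBasedShedder:
    """Bounded buffer that sheds the lowest-scoring tuple under overload.

    Hypothetical sketch: `score` stands in for whatever content analysis
    a real system would apply to each video tuple.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []      # min-heap on score: cheapest tuple is on top
        self._seq = count()  # arrival counter, also breaks score ties
        self.shed = 0        # number of tuples dropped so far

    def offer(self, item, score):
        entry = (score, next(self._seq), item)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return
        # Buffer is full: exactly one tuple must be dropped.
        self.shed += 1
        if score > self._heap[0][0]:
            heapq.heapreplace(self._heap, entry)  # evict the current minimum
        # Otherwise the incoming tuple is the one that is shed.

    def retained(self):
        """Return the buffered tuples in arrival order."""
        return [item for _, _, item in sorted(self._heap, key=lambda e: e[1])]


shedder = ContentBasedShedder(capacity=2)
for frame, score in [("f1", 0.2), ("f2", 0.9), ("f3", 0.5), ("f4", 0.1)]:
    shedder.offer(frame, score)
```

A random or drop-newest policy would be simpler, but it ignores content. Scoring each tuple lets the buffer keep the frames most likely to matter when the system is overloaded.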