
Results found: 5

Search results
in keywords: missing values (brakujące wartości)
Lookahead selective sampling for incomplete data
Missing values in data are common in real-world applications, and several methods deal with this problem. In this paper we present lookahead selective sampling (LSS) algorithms for datasets with missing values. We developed two versions of selective sampling. The first integrates, within the framework of the LSS algorithm, a distance function that can measure the similarity between pairs of incomplete points. The second uses ensemble clustering to represent the data in a cluster matrix without missing values and then runs the LSS algorithm on the ensemble clustering instance space (LSS-EC). To construct the cluster matrix, we use the k-means and mean shift clustering algorithms, specially modified to deal with incomplete datasets. We tested our algorithms on six standard numerical datasets from different fields. On these datasets we simulated missing values and compared the performance of the LSS and LSS-EC algorithms for incomplete data with two other basic methods. Our experiments show that the suggested selective sampling algorithms outperform the other methods.
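The abstract above does not specify its distance function, so the following is only a minimal sketch of one common heuristic for comparing incomplete points, the partial (rescaled) Euclidean distance. It is not the authors' method. The function name `partial_distance` and the use of `None` to mark a missing coordinate are assumptions made for illustration.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over the coordinates observed in BOTH points.

    Hypothetical helper: missing coordinates are marked with None.
    The squared sum is rescaled by (total dims / shared dims) so that
    pairs with few shared coordinates are not favoured artificially.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same dimensionality")
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        # No common observed coordinate: treat the pair as incomparable.
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))
```

For complete points the function reduces to the ordinary Euclidean distance; for incomplete points it extrapolates from the shared coordinates, which is a strong assumption when values are not missing at random.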
The paper presents a method for conditionally completing incomplete data with complements of similarity classes.
The problem of incomplete data is quite common, especially in actual measurement samples. It has therefore been widely discussed in the literature, especially within rough set theory, which was conceived as a tool for imprecise and inconsistent information systems. The aim of this work is to complete incomplete data using the relations designed for this problem (the similarity and tolerance relations). Based on information opposite to the incomplete object, we know the range of permitted values for that object. The method proposed in the article assumes that the information system contains information opposite to the sample being completed.
A hydrological model was applied (1) to select the best infilling method for missing precipitation and (2) to assess the impact of the length of deleted and filled precipitation data. The model was calibrated and validated using the hourly observed discharges from two gauges, located at the outlet of the catchment (62.34 km²) and in the inner sub-catchment (2.05 km²). Precipitation from four gauges was spatially interpolated over the whole catchment, while the sub-catchment used the precipitation from one gauge. Four scenarios with different lengths of deletion within three high-intensity events were established in the data of this gauge. Three infilling methods were applied and compared: substitution, linear regression and inverse distance weighting (IDW). Substitution showed the best results, followed by linear regression and IDW, at both scales. Differences between methods were significant in only 8.3% and 19.4% of all cases (sub-catchment and catchment, respectively). The impact of length was assessed using substitution only, by comparing the differences in discharges and performance statistics caused by the four scenarios. Higher differences in discharges were found at the catchment scale than in the inner sub-catchment, and they were insignificant for all events and scenarios. The hypothesis that a longer period of deleted and filled data would lead to a greater error in discharges did not hold in 11.1% and 16.7% of all cases (sub-catchment and catchment, respectively). In several cases (33.4% sub-catchment, 27.1% catchment), the model produced better results using the time series with filled gaps than with the configuration using observed data.
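Of the three infilling methods named above, inverse distance weighting is the easiest to illustrate. The sketch below fills one missing value at a target gauge from neighbouring gauges, weighting each neighbour by 1 / distance^power. The function name `idw_infill`, the planar coordinates, and the default `power=2.0` are assumptions made for illustration; the abstract does not state which exponent or gauge layout the study used.

```python
import math


def idw_infill(target_xy, neighbours, power=2.0):
    """Estimate a missing value at target_xy by inverse distance weighting.

    neighbours: list of ((x, y), value) pairs; a value of None means that
    gauge also has no observation for this time step and is skipped.
    power: distance exponent (assumed default, not taken from the study).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbours:
        if value is None:
            continue
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            # A gauge at the same location: use its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no neighbouring gauge has an observation")
    return weighted_sum / weight_total
```

With two neighbours at distances 1 and 2 reporting 2.0 and 6.0, the weights are 1 and 0.25, so the estimate is 3.5 / 1.25 = 2.8. Applied to a gap in a time series, the function would be called once per missing time step.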
The aim of the article is to establish a list of capabilities with reference to children's well-being in Poland. This issue can be considered a multicriteria classification problem in which the decision attribute classes are ordered by preference (criteria). We used a Dominance-based Rough Set Approach (DRSA) adapted to deal with missing values. The analysis was performed on a data set collected through a survey in the Zachodniopomorskie district.
Real-life data sets often suffer from missing data. The neuro-rough-fuzzy systems proposed hitherto often cannot handle such situations. The paper presents a neuro-fuzzy system for data sets with missing values. The proposed solution is a complete neuro-fuzzy system: it creates a rough fuzzy model from the presented data (both complete and with missing values) and can produce an answer for both complete examples and examples with missing values. The paper also describes the dedicated clustering algorithm and reports the results of numerical experiments.