Results found: 7

Search results
Searched in keywords: missing values
1
EN
A hydrological model was applied to (1) select the best infilling method for missing precipitation and (2) assess the impact of the length of the deleted and filled precipitation data. The model was calibrated and validated using hourly observed discharges from two gauges located at the outlet of the catchment (62.34 km²) and in an inner sub-catchment (2.05 km²). Precipitation from four gauges was spatially interpolated over the whole catchment, while the sub-catchment used precipitation from a single gauge. Four scenarios with different lengths of deletion within three high-intensity events were established in the data of this gauge. Three infilling methods were applied and compared: substitution, linear regression and inverse distance weighting (IDW). Substitution showed the best results, followed by linear regression and IDW, at both scales. Differences between methods were significant in only 8.3% and 19.4% of all cases (sub-catchment and catchment, respectively). The impact of gap length was assessed using substitution only, by comparing the differences in discharges and performance statistics caused by the four scenarios. Larger differences in discharges were found at the catchment scale than in the inner sub-catchment and were insignificant for all events and scenarios. The hypothesis that a longer length of deleted and filled data would lead to a greater error in discharges was wrong in 11.1% and 16.7% of all cases (sub-catchment and catchment, respectively). In several cases (33.4% sub-catchment, 27.1% catchment), the model produced better results using the time series with filled gaps than with the configuration using observed data.
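To make the comparison concrete, here is a minimal Python sketch of the three infilling methods named in the abstract, applied to a synthetic hourly series; the gauge names, distances and the donor-selection rule are illustrative assumptions, not details taken from the study.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly precipitation from four gauges; "target" has a gap.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-06-01", periods=48, freq="h")
df = pd.DataFrame(rng.gamma(1.0, 2.0, size=(48, 4)),
                  index=idx, columns=["target", "g1", "g2", "g3"])
df.loc[idx[10:16], "target"] = np.nan            # simulated deletion window

# 1) Substitution: copy values from a donor gauge (here: the most correlated one).
donor = df.drop(columns="target").corrwith(df["target"]).idxmax()
filled_sub = df["target"].fillna(df[donor])

# 2) Linear regression: regress the target on the donor gauge over observed hours.
obs = df["target"].notna()
a, b = np.polyfit(df.loc[obs, donor], df.loc[obs, "target"], 1)
filled_lr = df["target"].fillna(a * df[donor] + b)

# 3) Inverse distance weighting over the donor gauges (distances are assumed, in km).
dist = {"g1": 3.2, "g2": 5.1, "g3": 7.8}
w = np.array([1.0 / dist[g] ** 2 for g in dist])
idw = (df[list(dist)] * w).sum(axis=1) / w.sum()
filled_idw = df["target"].fillna(idw)
```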
2
Content available Lookahead selective sampling for incomplete data
EN
Missing values are common in real-world data, and several methods deal with this problem. In this paper we present lookahead selective sampling (LSS) algorithms for datasets with missing values. We developed two versions of selective sampling. The first integrates a distance function that can measure the similarity between pairs of incomplete points within the framework of the LSS algorithm. The second algorithm uses ensemble clustering to represent the data as a cluster matrix without missing values and then runs the LSS algorithm on the ensemble-clustering instance space (LSS-EC). To construct the cluster matrix, we use the k-means and mean shift clustering algorithms, specially modified to deal with incomplete datasets. We tested our algorithms on six standard numerical datasets from different fields. On these datasets we simulated missing values and compared the performance of the LSS and LSS-EC algorithms for incomplete data with two other basic methods. Our experiments show that the suggested selective sampling algorithms outperform the other methods.
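A minimal sketch of the cluster-matrix idea behind LSS-EC: every incomplete point is re-encoded by the labels it receives from several clustering runs, yielding a representation without missing values on which LSS can operate. The mean imputation inside plain k-means below is an assumption made for illustration; the paper uses specially modified k-means and mean shift algorithms.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_matrix(X, n_runs=5, k=3, seed=0):
    """Re-encode incomplete data X (NaN = missing) as an ensemble-clustering
    label matrix with no missing entries (assumption: mean-imputed k-means)."""
    rng = np.random.default_rng(seed)
    X_imp = np.where(np.isnan(X), np.nanmean(X, axis=0), X)  # simple imputation
    labels = []
    for _ in range(n_runs):
        km = KMeans(n_clusters=k, n_init=10,
                    random_state=int(rng.integers(10**6)))
        labels.append(km.fit_predict(X_imp))
    return np.column_stack(labels)               # shape: (n_samples, n_runs)

X = np.array([[1.0, 2.0], [np.nan, 2.1], [8.0, 9.0], [7.5, np.nan]])
M = cluster_matrix(X)                            # complete matrix for LSS-EC
```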
3
Content available remote Clustering with Missing Values
EN
The paper presents a clustering algorithm for data with missing values. In this approach both marginalisation and imputation are applied. The result of the clustering is a type-2 fuzzy set / rough fuzzy set. This approach makes it possible to distinguish between original and imputed data. The method can be applied to data sets in which every attribute lacks some values. The paper is accompanied by numerical examples of clustering synthetic and real-life data sets.
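The difference between marginalisation and imputation mentioned in the abstract can be sketched as follows; the rescaling of the marginal distance and the column-mean imputation are generic illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def marginal_distance(x, y):
    """Marginalisation: Euclidean distance over attributes observed in both
    vectors, rescaled to the full dimensionality (the rescaling is assumed)."""
    known = ~np.isnan(x) & ~np.isnan(y)
    if not known.any():
        return np.inf
    d2 = np.sum((x[known] - y[known]) ** 2)
    return float(np.sqrt(d2 * x.size / known.sum()))

def impute_means(X):
    """Imputation: replace each missing entry with its column mean."""
    return np.where(np.isnan(X), np.nanmean(X, axis=0), X)

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 5.0, np.nan])
print(marginal_distance(x, y))                   # compares only attribute 0
```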
4
EN
Real-life data sets often suffer from missing data. The neuro-rough-fuzzy systems proposed so far often cannot handle such situations. The paper presents a neuro-fuzzy system for data sets with missing values. The proposed solution is a complete neuro-fuzzy system. The system creates a rough fuzzy model from the presented data (both complete and with missing values) and is able to produce an answer for both complete and incomplete data examples. The paper also describes the dedicated clustering algorithm and is accompanied by the results of numerical experiments.
5
EN
The search for optimal parameters of a classifier based on simple granules of knowledge, investigated recently by the author (ARTIEMJEW 2010), raises the question of how stable these optimal parameters are. In this article we examine how the stability of the optimal granulation radius depends on random damage to the decision system. The experimental results show that stability depends on the extent of the damage and on the strategy for treating missing values. This line of research aims at finding methods of protecting damage-prone decision systems from a loss of classification effectiveness, that is, preserving classification performance close to that of the undamaged decision system.
PL
Recent research (ARTIEMJEW 2010) aimed at finding optimal classification parameters for decision modules based on simple granules of knowledge has raised the question of the stability of these optimal parameters. This paper examines how the stability of the optimal granulation radii depends on random damage to the decision system. The results clearly show that there is a relationship between stability and both the size of the damage and the strategies for treating the damaged values. Research of this kind aims at finding methods of protecting damage-prone decision systems from a decrease in their classification effectiveness. The goal was to preserve classification capabilities close to the effectiveness of undamaged decision systems.
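The "random damage" setting can be sketched as below: a fraction of the conditional-attribute entries of a decision table is replaced with missing values and then handled by some treatment strategy. The two strategies shown (dropping damaged objects, imputing the most common value) are generic examples and not necessarily those studied in the cited work.

```python
import numpy as np
import pandas as pd

def damage(table, fraction, seed=0):
    """Randomly replace `fraction` of the conditional-attribute entries with NaN."""
    rng = np.random.default_rng(seed)
    cond = table.drop(columns="decision")
    holes = rng.random(cond.shape) < fraction
    return cond.mask(holes).join(table["decision"])

def treat(table, strategy="impute"):
    """Two generic treatment strategies (illustrative, not the studied ones)."""
    if strategy == "drop":
        return table.dropna()                    # discard damaged objects
    return table.fillna(table.mode().iloc[0])    # most common value per column

rng = np.random.default_rng(1)
table = pd.DataFrame({"a": rng.integers(0, 3, 50),
                      "b": rng.integers(0, 3, 50),
                      "decision": rng.integers(0, 2, 50)})
repaired = treat(damage(table, fraction=0.2), strategy="impute")
```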
6
EN
The aim of the article is to establish a list of capabilities with reference to children's well-being in Poland. This issue can be treated as a multicriteria classification problem in which the classes of the decision attribute are ordered according to preference (criteria). We used a Dominance-based Rough Set Approach (DRSA) adapted to deal with missing values. The analysis was performed on a data set collected in the Zachodniopomorskie voivodeship through a survey.
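One common way to adapt the dominance relation of DRSA to missing values is to treat a missing evaluation as non-discriminating; the sketch below uses that convention as an assumption, since the abstract does not specify the exact adaptation used.

```python
import numpy as np

def dominates(x, y):
    """True if x weakly dominates y on all gain-type criteria.
    A missing evaluation (NaN) is treated as non-discriminating -- an assumed
    convention, one of several ways DRSA is adapted to missing values."""
    known = ~np.isnan(x) & ~np.isnan(y)
    return bool(np.all(x[known] >= y[known]))

x = np.array([3.0, np.nan, 2.0])   # criteria evaluations of object x
y = np.array([2.0, 4.0, 2.0])
print(dominates(x, y))             # True: compared only on criteria 0 and 2
```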
7
EN
The analysis of traffic safety data archives has improved markedly with the development of procedures that are heavily dependent upon computers. Three such procedures are described here. The first procedure involves using computers to assist in the identification and correction of invalid data. The second procedure makes greater computational demands, and involves using computerized algorithms to fill in the “gaps” that typically occur in archival data when information regarding key variables is not available. The third and most computer-intensive procedure involves using data mining techniques to search archives for interesting and important relationships between variables. These procedures are illustrated using examples from data archives that describe the characteristics of traffic accidents in the USA and Australia.
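The second procedure, filling gaps in archival records, can be illustrated with a generic group-wise hot-deck rule; the field names and the rule itself are hypothetical and not taken from the described archives.

```python
import numpy as np
import pandas as pd

# Hypothetical crash records with gaps in the speed_limit field.
crashes = pd.DataFrame({
    "road_type":   ["urban", "urban", "rural", "rural", "urban"],
    "speed_limit": [50.0, np.nan, 100.0, np.nan, 50.0],
    "severity":    [1, 2, 3, 2, 1],
})

# Fill each gap with the most common speed limit seen for the same road type
# (a simple group-wise rule; real archive algorithms are far more elaborate).
crashes["speed_limit"] = crashes.groupby("road_type")["speed_limit"].transform(
    lambda s: s.fillna(s.mode().iloc[0]))
```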