Results found: 7

Search results
Searched in keywords: missing values
1
EN
A hydrological model was applied to (1) select the best infilling method for missing precipitation and (2) assess the impact of the length of the deleted and filled precipitation data. The model was calibrated and validated using hourly observed discharges from two gauges located at the outlet of the catchment (62.34 km²) and in an inner sub-catchment (2.05 km²). Precipitation from four gauges was spatially interpolated over the whole catchment, while the sub-catchment used precipitation from a single gauge. Four scenarios with different lengths of deletion within three high-intensity events were established in the data of this gauge. Three infilling methods were applied and compared: substitution, linear regression and inverse distance weighting (IDW). Substitution showed the best results, followed by linear regression and IDW, at both scales. Differences between methods were significant in only 8.3% and 19.4% of all cases (sub-catchment and catchment, respectively). The impact of gap length was assessed using substitution only, by comparing the differences in discharges and performance statistics caused by the four scenarios. Larger differences in discharges were found at the catchment scale than in the inner sub-catchment and were insignificant for all events and scenarios. The hypothesis that a longer length of deleted and filled data would lead to a greater error in discharges was wrong in 11.1% and 16.7% of all cases (sub-catchment and catchment, respectively). In several cases (33.4% sub-catchment, 27.1% catchment), the model produced better results using the time series with filled gaps than with the configuration using observed data.
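To make the comparison concrete, here is a minimal Python sketch of the three infilling methods named in the abstract, applied to a synthetic hourly series; the gauge names, distances and the donor-selection rule are illustrative assumptions, not details taken from the study.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly precipitation from four gauges; "target" has a gap.
rng = np.random.default_rng(0)
idx = pd.date_range("2020-06-01", periods=48, freq="h")
df = pd.DataFrame(rng.gamma(1.0, 2.0, size=(48, 4)),
                  index=idx, columns=["target", "g1", "g2", "g3"])
df.loc[idx[10:16], "target"] = np.nan            # simulated deletion window

# 1) Substitution: copy values from a donor gauge (here: the most correlated one).
donor = df.drop(columns="target").corrwith(df["target"]).idxmax()
filled_sub = df["target"].fillna(df[donor])

# 2) Linear regression: regress the target on the donor gauge over observed hours.
obs = df["target"].notna()
a, b = np.polyfit(df.loc[obs, donor], df.loc[obs, "target"], 1)
filled_lr = df["target"].fillna(a * df[donor] + b)

# 3) Inverse distance weighting over the donor gauges (distances are assumed, in km).
dist = {"g1": 3.2, "g2": 5.1, "g3": 7.8}
w = np.array([1.0 / dist[g] ** 2 for g in dist])
idw = (df[list(dist)] * w).sum(axis=1) / w.sum()
filled_idw = df["target"].fillna(idw)
```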
2
Content available Lookahead selective sampling for incomplete data
EN
Missing values are common in real-world data, and several methods deal with this problem. In this paper we present lookahead selective sampling (LSS) algorithms for datasets with missing values. We developed two versions of selective sampling. The first integrates a distance function that can measure the similarity between pairs of incomplete points within the framework of the LSS algorithm. The second algorithm uses ensemble clustering to represent the data as a cluster matrix without missing values and then runs the LSS algorithm on the ensemble-clustering instance space (LSS-EC). To construct the cluster matrix, we use the k-means and mean shift clustering algorithms, specially modified to deal with incomplete datasets. We tested our algorithms on six standard numerical datasets from different fields. On these datasets we simulated missing values and compared the performance of the LSS and LSS-EC algorithms for incomplete data with two other basic methods. Our experiments show that the suggested selective sampling algorithms outperform the other methods.
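A minimal sketch of the cluster-matrix idea behind LSS-EC: every incomplete point is re-encoded by the labels it receives from several clustering runs, yielding a representation without missing values on which LSS can operate. The mean imputation inside plain k-means below is an assumption made for illustration; the paper uses specially modified k-means and mean shift algorithms.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_matrix(X, n_runs=5, k=3, seed=0):
    """Re-encode incomplete data X (NaN = missing) as an ensemble-clustering
    label matrix with no missing entries (assumption: mean-imputed k-means)."""
    rng = np.random.default_rng(seed)
    X_imp = np.where(np.isnan(X), np.nanmean(X, axis=0), X)  # simple imputation
    labels = []
    for _ in range(n_runs):
        km = KMeans(n_clusters=k, n_init=10,
                    random_state=int(rng.integers(10**6)))
        labels.append(km.fit_predict(X_imp))
    return np.column_stack(labels)               # shape: (n_samples, n_runs)

X = np.array([[1.0, 2.0], [np.nan, 2.1], [8.0, 9.0], [7.5, np.nan]])
M = cluster_matrix(X)                            # complete matrix for LSS-EC
```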
3
Content available remote Clustering with Missing Values
EN
The paper presents a clustering algorithm for data with missing values. In this approach both marginalisation and imputation are applied. The result of the clustering is a type-2 fuzzy set / rough fuzzy set. This approach makes it possible to distinguish between original and imputed data. The method can be applied to data sets in which every attribute lacks some values. The paper is accompanied by numerical examples of clustering synthetic and real-life data sets.
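The difference between marginalisation and imputation mentioned in the abstract can be sketched as follows; the rescaling of the marginal distance and the column-mean imputation are generic illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def marginal_distance(x, y):
    """Marginalisation: Euclidean distance over attributes observed in both
    vectors, rescaled to the full dimensionality (the rescaling is assumed)."""
    known = ~np.isnan(x) & ~np.isnan(y)
    if not known.any():
        return np.inf
    d2 = np.sum((x[known] - y[known]) ** 2)
    return float(np.sqrt(d2 * x.size / known.sum()))

def impute_means(X):
    """Imputation: replace each missing entry with its column mean."""
    return np.where(np.isnan(X), np.nanmean(X, axis=0), X)

x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 5.0, np.nan])
print(marginal_distance(x, y))                   # compares only attribute 0
```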
4
EN
Real-life data sets often suffer from missing data. The neuro-rough-fuzzy systems proposed so far often cannot handle such situations. The paper presents a neuro-fuzzy system for data sets with missing values. The proposed solution is a complete neuro-fuzzy system. The system creates a rough fuzzy model from the presented data (both complete and with missing values) and is able to produce an answer for both complete and incomplete data examples. The paper also describes the dedicated clustering algorithm and is accompanied by the results of numerical experiments.
5
EN
The search for optimal parameters of a classifier based on simple granules of knowledge, investigated recently by the author (ARTIEMJEW 2010), raises the question of how stable these optimal parameters are. In this article we examine how the stability of the optimal granulation radius depends on random damage to the decision system. The experimental results show that stability depends on the extent of the damage and on the strategy for treating missing values. This line of research aims at finding methods of protecting damage-prone decision systems from a loss of classification effectiveness, that is, preserving classification performance close to that of the undamaged decision system.
PL
Recent research (ARTIEMJEW 2010) aimed at finding optimal classification parameters for decision modules based on simple granules of knowledge has raised the question of the stability of these optimal parameters. This paper examines how the stability of the optimal granulation radii depends on random damage to the decision system. The results clearly show that there is a relationship between stability and both the size of the damage and the strategies for treating the damaged values. Research of this kind aims at finding methods of protecting damage-prone decision systems from a decrease in their classification effectiveness. The goal was to preserve classification capabilities close to the effectiveness of undamaged decision systems.
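The "random damage" setting can be sketched as below: a fraction of the conditional-attribute entries of a decision table is replaced with missing values and then handled by some treatment strategy. The two strategies shown (dropping damaged objects, imputing the most common value) are generic examples and not necessarily those studied in the cited work.

```python
import numpy as np
import pandas as pd

def damage(table, fraction, seed=0):
    """Randomly replace `fraction` of the conditional-attribute entries with NaN."""
    rng = np.random.default_rng(seed)
    cond = table.drop(columns="decision")
    holes = rng.random(cond.shape) < fraction
    return cond.mask(holes).join(table["decision"])

def treat(table, strategy="impute"):
    """Two generic treatment strategies (illustrative, not the studied ones)."""
    if strategy == "drop":
        return table.dropna()                    # discard damaged objects
    return table.fillna(table.mode().iloc[0])    # most common value per column

rng = np.random.default_rng(1)
table = pd.DataFrame({"a": rng.integers(0, 3, 50),
                      "b": rng.integers(0, 3, 50),
                      "decision": rng.integers(0, 2, 50)})
repaired = treat(damage(table, fraction=0.2), strategy="impute")
```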
6
EN
The aim of the article is to establish a list of capabilities with reference to children's well-being in Poland. This issue can be treated as a multicriteria classification problem in which the classes of the decision attribute are ordered according to preference (criteria). We used a Dominance-based Rough Set Approach (DRSA) adapted to deal with missing values. The analysis was performed on a data set collected in the Zachodniopomorskie voivodeship through a survey.
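One common way to adapt the dominance relation of DRSA to missing values is to treat a missing evaluation as non-discriminating; the sketch below uses that convention as an assumption, since the abstract does not specify the exact adaptation used.

```python
import numpy as np

def dominates(x, y):
    """True if x weakly dominates y on all gain-type criteria.
    A missing evaluation (NaN) is treated as non-discriminating -- an assumed
    convention, one of several ways DRSA is adapted to missing values."""
    known = ~np.isnan(x) & ~np.isnan(y)
    return bool(np.all(x[known] >= y[known]))

x = np.array([3.0, np.nan, 2.0])   # criteria evaluations of object x
y = np.array([2.0, 4.0, 2.0])
print(dominates(x, y))             # True: compared only on criteria 0 and 2
```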
7
EN
The analysis of traffic safety data archives has improved markedly with the development of procedures that are heavily dependent upon computers. Three such procedures are described here. The first procedure involves using computers to assist in the identification and correction of invalid data. The second procedure makes greater computational demands, and involves using computerized algorithms to fill in the “gaps” that typically occur in archival data when information regarding key variables is not available. The third and most computer-intensive procedure involves using data mining techniques to search archives for interesting and important relationships between variables. These procedures are illustrated using examples from data archives that describe the characteristics of traffic accidents in the USA and Australia.
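The second procedure, filling gaps in archival records, can be illustrated with a generic group-wise hot-deck rule; the field names and the rule itself are hypothetical and not taken from the described archives.

```python
import numpy as np
import pandas as pd

# Hypothetical crash records with gaps in the speed_limit field.
crashes = pd.DataFrame({
    "road_type":   ["urban", "urban", "rural", "rural", "urban"],
    "speed_limit": [50.0, np.nan, 100.0, np.nan, 50.0],
    "severity":    [1, 2, 3, 2, 1],
})

# Fill each gap with the most common speed limit seen for the same road type
# (a simple group-wise rule; real archive algorithms are far more elaborate).
crashes["speed_limit"] = crashes.groupby("road_type")["speed_limit"].transform(
    lambda s: s.fillna(s.mode().iloc[0]))
```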