Search results - BazTech

1

Analiza wybranych metod walidacji krzyżowej w programie RSES [Analysis of selected cross-validation methods in the RSES program]

Kołpacki Radosław

Studia i Materiały Informatyki Stosowanej | 2024 | T. 16, nr 1

PL

The article analyses a dataset using two cross-validation methods. The RSES program was used to identify key properties and relationships in the set. The results show that some parameters affect the potential accuracy of the results.

EN

This article presents an analysis of a dataset using two cross-validation methods. The RSES program was employed to identify key properties and relationships within the dataset. The results indicate the impact of certain parameters on the potential accuracy of the outcomes.

2

Analiza wybranych metod walidacji krzyżowej w programie RSES [Analysis of selected cross-validation methods in the RSES program]

Bethke Beata

Studia i Materiały Informatyki Stosowanej | 2024 | T. 16, nr 1

11--14

PL

The article analyses a dataset using two cross-validation methods. The RSES program was used to identify key properties and relationships in the set. The results show that some parameters affect the potential accuracy of the results.

EN

This article presents an analysis of a dataset using two cross-validation methods. The RSES program was employed to identify key properties and relationships within the dataset. The results indicate the impact of certain parameters on the potential accuracy of the outcomes.

3

A crossvalidation-based comparison of kriging and IDW in local GNSS/levelling quasigeoid modelling

Ligas Marcin, Lucki Blazej, Banasik Piotr

Reports on Geodesy and Geoinformatics | 2022 | Vol. 114

1--7

EN

This study compares two interpolation methods for local GNSS/levelling (quasi)geoid modelling. It uses raw data; no global geopotential model is involved. The methods, ordinary kriging/least-squares collocation with constant trend and inverse distance weighting (IDW), differ in the complexity of the modelling procedure and in theoretical background. The comparison was carried out through leave-one-out and random (Monte Carlo) cross-validation. The performance of ordinary kriging and IDW was tested with local (using a limited number of data) and global (using all available data) neighbourhoods, with various planar covariance function models for kriging and various exponents (power parameter) for IDW. For the study area, both methods assure an overall accuracy level, measured by mean absolute error, root mean square error and median absolute error, of less than 1 cm. Although IDW is much simpler, suitably selected parameters (and trend removal) can make the differences between the methods virtually negligible (a fraction of a millimetre).
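The leave-one-out procedure and the three error measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the sample coordinates and the sample values are all hypothetical, and only plain IDW with a global neighbourhood is shown (no kriging, no trend removal).

```python
import math
from statistics import median

def idw_predict(points, values, target, power=2.0):
    """Inverse distance weighting estimate at `target` from known points."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a data point
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def loo_errors(points, values, power=2.0):
    """Leave-one-out residuals: predict each point from all the others."""
    errors = []
    for i in range(len(points)):
        rest_p = points[:i] + points[i + 1:]
        rest_v = values[:i] + values[i + 1:]
        errors.append(idw_predict(rest_p, rest_v, points[i], power) - values[i])
    return errors

def summarize(errors):
    """Mean absolute error, root mean square error, median absolute error."""
    abs_err = [abs(e) for e in errors]
    return {
        "mae": sum(abs_err) / len(abs_err),
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "medae": median(abs_err),
    }

# Hypothetical sample: four corner points and one centre point (values in metres).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [1.0, 2.0, 3.0, 4.0, 2.5]

for p in (1.0, 2.0, 3.0):  # compare several IDW exponents
    print(p, summarize(loo_errors(points, values, power=p)))
```

Running the loop over several exponents mirrors, in a very reduced form, the idea of tuning the IDW power parameter by cross-validation.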

4

Development of a precise local quasigeoid model for the city of Krakow - QuasigeoidKR2019

Banasik Piotr, Bujakowski Kazimierz, Kudrys Jacek, Ligas Marcin

Reports on Geodesy and Geoinformatics | 2020 | Vol. 109

25--31

EN

A geoid or quasigeoid model allows the integration of satellite measurements with ground levelling measurements in valid height systems. A precise quasigeoid model has been developed for the city of Krakow. One of the goals of the model construction was to provide a more detailed quasigeoid course than the one offered by the national model PL-geoid2011. Only four measurement points in the area of Krakow were used to build the national quasigeoid model. It can be assumed that, due to the small number of points and their uneven distribution over the city area, the quasigeoid can be determined less accurately. This was the reason for developing a local quasigeoid model based on a larger number of evenly distributed points. The quasigeoid model was based on 66 evenly distributed points (from 2.5 km to 5.0 km apart) in the study area. The modelling used height anomalies determined at these points from normal heights derived through levelling and ellipsoidal heights derived through GNSS surveys. Height anomalies from the global geopotential model EGM2008 served as a long-wavelength trend in those derived from surveys. Analyses showed that the developed height anomaly model fits the empirical data at the level of single millimetres (mean absolute difference 0.005 m). The developed local model QuasigeoidKR2019, like the national model PL-geoid2011, is closely related to the reference and height systems in Poland. Such models are used to integrate GNSS and levelling observations. The local QuasigeoidKR2019 and national PL-geoid2011 models were compared for the reference frame PL-ETRF2000 and height datum PL-KRON86-NH. The comparison of the two models with respect to GNSS/levelling height anomalies shows a threefold reduction in the values of the individual quartiles and of the mean absolute difference for the developed local model. These summary statistics clearly indicate that the accuracy of the local model for the city of Krakow is significantly higher than that of the national one.

5

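Modelowanie geostatystyczne w wyznaczaniu przestrzennego rozkładu parametrów petrofizycznych utworów ilasto-mułowcowych [Geostatistical modelling in determining the spatial distribution of petrophysical parameters of claystone-mudstone formations]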

Kwilosz Tadeusz

Nafta-Gaz | 2020 | R. 76, nr 4

230--238

PL

As gas reserves decline in deposits located in sandstone formations and carbonate rocks, structural objects associated with sandstone-mudstone-shale formations located in Miocene strata are attracting growing interest. These deposits are characterized by high facies variability, including the pinching out of laminae and thin layers. Consequently, the porosity determined for these rocks cannot be counted in the balance of effective porosity used to estimate recoverable reserves. When calculating effective thickness, the thickness of shale rocks should be omitted. One of the principal problems addressed in this study is separating out the thickness of rocks belonging to the individual lithological types. Gamma-ray log measurements were used to recognize the lithological variability of this type of rock. It should be stressed, however, that because the sandstone and mudstone layers are very thin, they cannot be identified unambiguously by gamma-ray logging. Gamma-ray logging indicates values averaged over the measurement interval rather than specific layers. Therefore, the percentage share of the thickness of each lithological type in the total thickness of the structure penetrated by each well was estimated. This division was carried out by analysing the empirical distributions of the gamma-ray measurements. It was assumed that lithological heterogeneities would show up in the histograms as multimodal distributions or clear changes in monotonicity. These data were used to generate spatial distributions of parameters on the model grid at points not covered by measurement. Ordinary kriging and kriging with a trend were used for this purpose. Using the same interpolation methods, maps of the top and bottom of the model of the studied deposit were generated. The results of porosity measurements were assigned to the previously separated lithological types. For each rock type, and using both kriging methods, the porosity distribution at the grid points of the deposit model was determined. Cross-validation was used to estimate the uncertainty of the results obtained. Finally, the effective pore volume of the deposit model was calculated, assuming that only sandstone and mudstone rocks are a source of effective pores. The uncertainty of the result was estimated. Real geophysical measurement data for a deposit in Miocene strata were used in the study.

EN

Due to decreasing gas reserves in deposits found in sandstone and carbonate formations, geological structures composed of mudstone and shale layers found in Miocene formations are attracting growing interest. These deposits are characterized by high facies variability. Shale rocks have very low permeability. Therefore, the porosity determined for these rocks cannot be taken into account in the balance of the effective porosity used to estimate the natural gas reserves. Shale rock thickness should be excluded when calculating the effective thickness of the gas-bearing formations. The main problem of this study is the proper separation of the rocks belonging to individual lithological facies. Gamma ray logs were used to identify the lithological variability of this type of rock. It should be emphasized, however, that due to the very low thickness of sandstones and mudstones, it was not possible to identify them clearly using the archival gamma ray logs. Because of their measurement resolution, the archival gamma ray logs indicate the average values of the layer rather than those of specific laminae. Therefore, the percentage share of the thickness of each lithological type in the total layer thickness in each well was estimated. This division was made by analysing the empirical distributions of gamma ray logs. It was assumed that lithological heterogeneities would be visible on experimental histograms in the form of multimodal distributions or clear changes in monotonicity. These data were used to generate a spatial distribution of parameters on the model grid at points not covered by the measurement. For this purpose, the ordinary kriging method and the kriging with trend method were used. Using the same interpolation methods, the structure maps of the top and bottom of the model were generated. The results of the porosity measurements were assigned to the previously separated lithological types of rocks. The distribution of porosity at the grid points of the deposit model was determined for each type of rock using both kriging methods. The cross-validation method was used to assess the uncertainty of the results. Finally, the effective pore volume of the deposit model was calculated, assuming that only sandstone and mudstone rocks are sources of the effective pores. The uncertainty of the analysis was estimated. Real data from geophysical measurements for the Miocene gas field were used for the study.

6

Alternative Approach to Evaluating Interpolation Methods of Small and Imbalanced Data Sets

Gonet T., Gonet K.

Geomatics and Environmental Engineering | 2017 | Vol. 11, no. 3

49--65

PL

The research concerns an alternative approach to assessing the quality of interpolation methods for small and heterogeneous data sets. A basic statistical analysis based on classical cross-validation does not always give unambiguous conclusions. For the analysed data set (which does not follow a normal distribution), three interpolation methods were selected as the best (according to the classical cross-validation procedure). Nevertheless, the maps produced with these methods clearly differ from one another. This is why an in-depth statistical analysis was necessary. An alternative approach to the problem is proposed that takes into account a broader range of parameters describing the data set under study. The main idea of the methodology is to compare not only the standard deviation of the estimator but also three additional parameters. This makes the final assessment considerably more accurate. The analysis was performed with the Surfer program (Golden Software). It offers many interpolation methods together with a variety of adjustable parameters.

EN

The research concerns an alternative approach to the evaluation of interpolation methods for mapping small and imbalanced data sets. A basic statistical analysis of the standard cross-validation procedure is not always conclusive. In the case of the investigated data set (which is inconsistent with normal distribution), three interpolation methods have been selected as the most reliable (according to standard cross-validation). However, maps resulting from the aforementioned methods clearly differ from each other. This is the reason why a comprehensive statistical analysis of the studied data is a necessity. We propose an alternative approach that evaluates a broadened scope of parameters describing the data distribution. The general idea of the methodology is to compare not only the standard deviation of the estimator but also three additional parameters to make the final assessment much more accurate. The analysis has been carried out with the use of Golden Software Surfer. It provides a wide range of interpolation methods and numerous adjustable parameters.

7

A Simple Method for Calculating the Detonation Pressure of Ideal and Non-Ideal Explosives Containing Aluminum and Ammonium Nitrate

Jafari M., Keshavarz M. H.

Central European Journal of Energetic Materials | 2017 | Vol. 14, no. 4

966--983

EN

A general and simple method has been developed for calculating the detonation pressure of different kinds of ideal and non-ideal explosives containing aluminum (Al) and ammonium nitrate (AN). The new model can be applied to CHNO and CHNOFCl explosives in pure form or as mixtures as well as non-ideal mixed explosives including Al and AN. It can also be used for different plastic bonded explosives (PBXs). There is no need for any prior knowledge about the measured or calculated properties of the explosive. The only data needed are the standard enthalpy of formation and the loading density of the desired explosive. The predicted detonation pressures were compared with other predictive methods and outputs of BKWS-EOS, in both full and partial equilibrium. Different statistical parameters as well as cross validation parameters showed that the new model is precise, accurate, well-defined, and robust for predicting the detonation pressures of CHNOFCl(Al/AN) energetic materials.

8

Music Performers Classification by Using Multifractal Features : A Case Study

Reljin N., Pokrajac D.

Archives of Acoustics | 2017 | Vol. 42, No. 2

223--233

EN

In this paper, we investigate the possibility of classifying different performers playing the same melodies in the same manner, performances that are subjectively quite similar and very difficult to distinguish even for musically skilled persons. To resolve this problem we propose the use of multifractal (MF) analysis, which has proven to be an efficient method for describing and quantifying complex natural structures, phenomena or signals. We found experimentally that parameters associated with some characteristic points within the MF spectrum can be used as music descriptors, thus permitting accurate discrimination of music performers. Our approach is tested on a dataset containing the same songs performed by the music group ABBA and by the actors in the movie Mamma Mia. As a classifier we used support vector machines, and the classification performance was evaluated using four-fold cross-validation. The results of the proposed method were compared with those obtained using mel-frequency cepstral coefficients (MFCCs) as descriptors. For the considered two-class problem, overall accuracy and F-measure higher than 98% were obtained with the MF descriptors, considerably better than with the MFCC descriptors, for which the best results were below 77%.
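The evaluation protocol mentioned in this abstract (four-fold cross-validation reporting overall accuracy and F-measure for a two-class problem) can be sketched as below. This is an illustration under stated assumptions: a nearest-centroid classifier is swapped in for the support vector machine so that the example needs only the standard library, and the feature vectors, labels and function names are hypothetical rather than taken from the paper.

```python
import random

def k_fold_indices(n, k=4, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Mean feature vector per class (a stand-in for the SVM, not the paper's classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, row):
    """Return the class whose centroid is closest to `row`."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(centroids, key=sqdist)

def cross_validate(X, y, k=4, positive=1, seed=0):
    """Pool predictions over all k held-out folds; return (accuracy, F-measure)."""
    tp = fp = fn = correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            pred = predict(model, X[i])
            correct += int(pred == y[i])
            tp += int(pred == positive and y[i] == positive)
            fp += int(pred == positive and y[i] != positive)
            fn += int(pred != positive and y[i] == positive)
    accuracy = correct / len(X)
    f_measure = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f_measure

# Hypothetical two-class data: two well-separated clusters of 2-D descriptors.
X = [[0.1 * i, 0.0] for i in range(8)] + [[5.0 + 0.1 * i, 5.0] for i in range(8)]
y = [0] * 8 + [1] * 8

print(cross_validate(X, y, k=4, positive=1))
```

Pooling the counts over all folds before computing the F-measure is one common convention; averaging per-fold scores is another, and the abstract does not say which was used.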

9

Geostatistical analysis of variability of silica dioxide content within limestone deposit

Świtoń J. M.

Mining Science | 2015 | Vol. 22, Special Issue 2

181--193

EN

This paper presents a geostatistical analysis of a qualitative parameter within a limestone deposit. The parameter was the content of silica dioxide. The geostatistical analysis was carried out to identify the variability of the parameter, which significantly influenced ore exploration. Sampling data were examined using descriptive statistics; the logarithmic character of the parameter’s distribution was indicated. After logarithmic transformation, omnidirectional semivariograms were calculated because directional anisotropy was not proven. A few theoretical models were fitted to the semivariogram and then verified by cross-validation. Estimation results were obtained by the lognormal ordinary kriging technique. They did not confirm that the models classified during cross-validation as best fitting are also the most reliable in estimation. It is recommended to continue research on the variability of parameters within the limestone deposit, including analysis by the indicator kriging technique. All stages of the geostatistical analysis were carried out in Isatis software.

10

Local height transformation through polynomial regression

Ligas M., Banasik P.

Geodesy and Cartography | 2012 | Vol. 61, no. 1

3--17

EN

The paper presents the results of a transformation between two height systems, Kronstadt’60 and Kronstadt’86, within the area of Krakow’s district, the latter system being nowadays a part of the National Spatial Reference System in Poland. The transformation between the two height systems was carried out using polynomial regression, which is well known and frequently applied in geodesy. Although it is well known and frequently applied, it is rather seldom tested more broadly with respect to the optimal degree of the polynomial function, goodness of fit and predictive capabilities. In this study, some statistical tests, measures and techniques helpful in analyzing a polynomial transformation function (and not only that) have been used.

PL

The article presents the results of a height transformation between the Kronstadt’60 and Kronstadt’86 systems in the area of the Kraków district. The latter system is currently part of the National Spatial Reference System in force in Poland. The transformation between these height systems was carried out using polynomial regression, which is well known and frequently applied in geodesy. Despite its widespread use, a broader analysis of it in terms of the optimal polynomial degree, goodness of fit and predictive capability is rarely found in the literature. In this study, various methods were used to obtain statistical confidence in the correctness and practical usefulness of the described model.
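One way to examine the optimal polynomial degree and the predictive capability mentioned in this record is to compare leave-one-out prediction errors across degrees. The sketch below is only an assumption about how such a check could look: it is one-dimensional, uses synthetic data, and all function names are hypothetical. The abstract does not state which tests or measures the authors actually applied.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def loo_rmse(xs, ys, degree):
    """Root mean square of leave-one-out prediction errors for one degree."""
    sq = 0.0
    for i in range(len(xs)):
        c = polyfit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], degree)
        sq += (polyval(c, xs[i]) - ys[i]) ** 2
    return (sq / len(xs)) ** 0.5

# Synthetic data: a quadratic signal with a small alternating perturbation.
xs = [-2.0 + 0.5 * i for i in range(9)]
ys = [1.0 + 0.5 * x + 0.25 * x * x + (0.01 if i % 2 else -0.01)
      for i, x in enumerate(xs)]

for degree in range(1, 5):
    print(degree, loo_rmse(xs, ys, degree))
```

Solving the normal equations directly is adequate for the low degrees shown here; for higher degrees a QR-based least-squares solver would be numerically safer.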

11

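Projektowanie sztucznych sieci neuronowych w zagadnieniu identyfikacji parametrów geometrycznych łuków [Design of artificial neural networks for the problem of identifying the geometric parameters of arches]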

Kłos M.

Zeszyty Naukowe Politechniki Rzeszowskiej. Budownictwo i Inżynieria Środowiska | 2010 | z.57[271], nr 2

21-28

PL

Two methods are proposed for designing artificial neural networks (ANNs) for identifying the geometric parameters of arches. One is the commonly used cross-validation method, in which the minimum of the error function is sought. The other is a modern method, MML (maximum marginal likelihood), based on the Bayesian approach. To compare the two methods, cascade networks were analysed in which the input always consisted of six basic natural vibration frequencies and, at each step of the cascade, the geometric parameters of the arch obtained from network training. The output was always a single geometric parameter. The results confirm the effectiveness of the MML method, applied instead of cross-validation directly to the whole data set, without multiple repetitions.

EN

Two methods are proposed for the design of artificial neural networks (ANNs) for identifying the shape parameters of arches. One is the commonly applied cross-validation method, in which the minimum of the error function is sought. The other is the modern MML (Maximum Marginal Likelihood) method, taken from the Bayesian approach. In the paper, the design of an ANN concerns the search for an optimal number of neurons H in the hidden layer of the network. It is illustrated with six numerical examples. In these problems the input vector always consisted of the first six eigenfrequencies plus, at each step of the cascade, one shape parameter. The results obtained lead to the conclusion that the MML criterion can be used instead of the cross-validation method. This conclusion is of practical value, since it permits ANNs to be designed without forming a test set of patterns.

12

Probabilities of discrepancy between minima of cross-validation, Vapnik bounds and true risks

Klęsk P.

International Journal of Applied Mathematics and Computer Science | 2010 | Vol. 20, no 3

525-544

EN

Two known approaches to complexity selection are considered: n-fold cross-validation and structural risk minimization. Obviously, in either approach, a discrepancy between the indicated optimal complexity (indicated as the minimum of a generalization error estimate or a bound) and the genuine minimum of unknown true risks is possible. In the paper, this problem is posed in a novel quantitative way. We state and prove theorems demonstrating how one can calculate pessimistic probabilities of discrepancy between these minima for given conditions of an experiment. The probabilities are calculated in terms of all relevant constants: the sample size, the number of cross-validation folds, the capacity of the set of approximating functions and bounds on this set. We report experiments carried out to validate the results.

13

Facial Emotion Classification Using Active Appearance Model and Support Vector Machine Classifier

Beszédeš M., Culverhouse P., Oravec M.

Machine Graphics and Vision | 2009 | Vol. 18, No. 1

21-46

EN

Automatic analysis of human facial expression is an interesting and non-trivial problem. In the last decade, many approaches have been described for emotion recognition based on the analysis of facial expression. However, little has been done in the sub-area of recognizing facial emotion intensity levels. This paper analyses the use of Active Appearance Models (AAMs) and Support Vector Machine (SVM) classifiers in the recognition of human facial emotion and emotion intensity levels. AAMs are known as a tool for statistical modeling of object shape/appearance or for precise object feature detection. In our case, we examine their properties as a technique for feature extraction. We analyze the influence of various facial feature data types (shape / texture / combined AAM parameter vectors) and the size of facial images on the final classification accuracy. Then, approaches to the proper adjustment of training parameters for C-SVM classifiers (RBF kernel) are described. Moreover, an alternative way of evaluating classification accuracy, using the human visual system as a reference point, is discussed. Unlike the usual approach to the evaluation of recognition algorithms (based on comparison of final classification accuracies), the proposed evaluation scheme is independent of the testing set parameters, such as the number, age and gender of subjects or the intensity of their emotions. Finally, we show that our automatic system labels the emotion categories of images more consistently than human subjects do, while humans are more consistent than our system in identifying emotion intensity levels.

14

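Maksimum całkowitej wiarygodności zamiast walidacji krzyżowej w projektowaniu sztucznych sieci neuronowych [Maximum marginal likelihood instead of cross-validation in the design of artificial neural networks]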

Waszczyszyn Z., Słoński M.

Zeszyty Naukowe Politechniki Rzeszowskiej. Budownictwo i Inżynieria Środowiska | 2007 | z. 45 {243}

173-185

PL

The cross-validation method is commonly used to design artificial neural networks (ANNs). In this paper, design refers to computing the optimal values of the regularization parameter or of the number of neurons in the hidden layer of the ANN. The cross-validation method relies on finding the minimum of the validation curve, since the training curve is a monotonically decreasing function of the regularization parameters mentioned. To change the ANN design criterion, the maximum likelihood curve used in the Bayesian approach was adopted. In the MML (Maximum Marginal Likelihood) criterion, the maximum of the marginal likelihood function lnCW(PR; L) is computed, where CW is the marginal likelihood probability and L is the size of the training set. The effectiveness of the proposed approach is demonstrated on two numerical examples. The results lead to the conclusion that the MML criterion can be used instead of cross-validation. This conclusion is of practical importance, especially for small data sets, because it makes it possible to design ANNs without forming a validation set.

EN

The cross-validation method is commonly applied in the design of Artificial Neural Networks (ANNs). In the paper, the design of an ANN concerns the search for an optimal value of the regularization parameter or of the number of neurons in the hidden layer of the network. The cross-validation error curve has a minimum, whereas the training error curve is monotonically decreasing. In order to change the design criterion, the marginal likelihood curve, taken from the Bayesian approach, can be used. A corresponding formula for plotting the curve is briefly discussed. The MML (Maximum Marginal Likelihood) criterion, applied to find optimal values of the design parameters, is illustrated with two numerical examples. The results obtained lead to the conclusion that the MML criterion can be used instead of the cross-validation method. This conclusion is of practical value (especially for small data sets), since it permits ANNs to be designed without forming a validation set of patterns.

15

Interpolation Models for Spatiotemporal Association Mining

Li D., Deogun J.S.

Fundamenta Informaticae | 2004 | Vol. 59, nr 2,3

153--172

EN

In this paper, we investigate interpolation methods that are suitable for discovering spatiotemporal association rules for unsampled sites, with a focus on the drought risk management problem. For drought risk management, raw weather data is collected, converted to various indices, and then mined for association rules. To generate association rules for the unsampled sites, interpolation methods can be applied at any stage of this data mining process. We develop and integrate three interpolation models into our association rule mining algorithm. We call them pre-order, in-order and post-order interpolation models. The performance of these three models is experimentally evaluated by comparing the interpolated association rules with the rules discovered from actual raw data, based on two metrics, precision and recall. Our experiments show that the post-order interpolation model provides the highest precision among the three models, and the Kriging method in the pre-order interpolation model presents the highest recall.
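The two metrics named in this abstract can be illustrated with a short sketch that compares a set of interpolated rules against the rules mined from actual data. The rule representation (antecedent and consequent as a tuple), the sample rules and the exact-match criterion are assumptions made for illustration; the abstract does not specify how the authors matched rules.

```python
def precision_recall(interpolated_rules, actual_rules):
    """Precision and recall of interpolated rules against rules mined from raw data.

    Two rules count as matching only if they are exactly equal (an assumption).
    """
    interpolated = set(interpolated_rules)
    actual = set(actual_rules)
    hits = len(interpolated & actual)
    precision = hits / len(interpolated) if interpolated else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical rules: (antecedent, consequent) pairs of index conditions.
actual = [
    ("SPI3=dry", "PDSI=drought"),
    ("SPI6=dry", "PDSI=drought"),
    ("SPI3=wet", "PDSI=normal"),
    ("SOI=low", "SPI3=dry"),
]
interpolated = [
    ("SPI3=dry", "PDSI=drought"),
    ("SPI3=wet", "PDSI=normal"),
    ("SOI=high", "SPI3=wet"),
]

print(precision_recall(interpolated, actual))
```

Here precision is the share of interpolated rules that also appear in the actual rule set, and recall is the share of actual rules that the interpolation recovered.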

16

Cross-validation techniques in the practical problem of the choice of the regression estimator

Grzybowski A.

Prace Naukowe Instytutu Matematyki i Informatyki Politechniki Częstochowskiej | 2003 | Vol. 2, nr 1

39--44

EN

The paper is devoted to the problem of choosing between various regression estimators in real-world applications. We emphasise the role of cross-validation techniques in making such a choice in practice, especially in situations where theoretical assumptions about the problem under consideration are difficult to verify and the aim of model building is the prediction of future values of the response variable.

17

A Comparison of Different Decision Algorithms Used in Volumetric Storm Cells Classification

Suraj Z., Peters J.F., Rząsa W.

Fundamenta Informaticae | 2002 | Vol. 51, nr 1,2

201-214

EN

Decision algorithms useful in classifying meteorological volumetric radar data are the subject of the experiments described in the paper. The data come from the Radar Decision Support System (RDSS) database of Environment Canada and concern summer storms occurring in that country. Some research groups have used the data collected by RDSS to verify the utility of chosen methods in volumetric storm cell classification. The paper reviews experiments carried out on the data from the RDSS database of Environment Canada and presents the quality of particular classifiers. The classification accuracy coefficient is used to express quality. For five research groups that conducted their experiments in a similar way, it was possible to compare the results obtained. The experiments showed that the Support Vector Machine (SVM) method and rough set algorithms that use object-oriented reducts for rule generation classify volumetric storm data better than other classifiers.