A new look at principal component analysis (PCA) has been presented. First, a geometric interpretation of the coefficient of determination was shown. Next, it was shown that the analyzed data and their interdependencies can be represented as easy-to-understand basic geometric structures. Based on the analysis of these structures, enrichments to classical PCA were proposed: in particular, a new criterion for selecting important principal components and a new algorithm for clustering primary variables by their level of similarity to the principal components. Virtual and real data spaces, as well as tensor operations on data, have also been identified, as has the anisotropy of the data.
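The geometric reading of the coefficient of determination mentioned above can be illustrated with a minimal sketch. It relies on the standard identity for simple linear regression with an intercept: R2 equals the squared cosine of the angle between the two mean-centered data vectors. The paper's own construction may differ; the function names and the sample data below are hypothetical.

```python
import numpy as np

def r2_from_regression(x, y):
    """R2 of a least-squares line y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_from_angle(x, y):
    """Squared cosine of the angle between the mean-centered vectors."""
    xc = x - x.mean()
    yc = y - y.mean()
    cos_angle = (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))
    return cos_angle ** 2

# Hypothetical sample data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

print(r2_from_regression(x, y), r2_from_angle(x, y))
```

Both functions should return the same value up to rounding, which is what makes the angle between centered variables a natural geometric picture of how well one variable determines another.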
2
In the introduction, the author recalls the definition of the coefficient of determination and the idea of the maximum likelihood method. First, the new coefficient defined by McFadden as the power of the prediction process, called pseudo R2, is characterized. Similar R2-type coefficients proposed by other well-known statisticians are described. A hierarchical way of building a logistic regression model is then introduced, in which successive variables are added to fit the estimated model to the empirical data. Four kinds of variables were analysed, whose influence on the fit of the estimated logistic regression model depended on the number of inconsistencies occurring in the data. Calculations of the R2 parameter using the McFadden, Nagelkerke and Cox-Snell formulas are presented. The article ends with concise conclusions on the logistic regression model estimated from empirical data on the tested MD-8 type fuses, and this model is compared with the R2 values calculated in the article. It is stated that McFadden's pseudo R2 is the most frequently used of these parameters and that it defines the power of the prediction process.
PL (translated)
The article characterizes the new coefficient defined by McFadden as the power of the prediction process and called pseudo R2. A practical example of determining predictive power on the basis of selected empirical data is presented. A hierarchical way of building a logistic regression model is shown, in which successive variables are added to fit the estimated model to the empirical data. Four kinds of variables were analysed, whose influence on the fit of the estimated logistic regression model depended on the number of inconsistencies occurring in the data. Calculations of the R2 parameter using the McFadden, Nagelkerke and Cox-Snell formulas are presented. The article closes with concise conclusions on the logistic regression model estimated from empirical data on the tested MD-8 type fuses, and compares this model with the R2 values calculated in the article. It was found that McFadden's pseudo R2 is the most frequently used parameter and that it defines the power of the prediction process.
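The three pseudo R2 measures named in this record can be sketched from their usual textbook definitions, which compare the log-likelihood of the fitted model with that of an intercept-only (null) model. This is an illustration under stated assumptions, not the article's calculation: the outcomes `y` and predicted probabilities `p` below are hypothetical, and in practice `p` would come from a logistic regression fitted by maximum likelihood.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of binary outcomes y under probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def pseudo_r2(y, p):
    """McFadden, Cox-Snell and Nagelkerke pseudo R2 for predictions p."""
    n = len(y)
    base_rate = sum(y) / n  # the null model predicts the overall event rate
    ll_null = log_likelihood(y, [base_rate] * n)
    ll_model = log_likelihood(y, p)
    mcfadden = 1.0 - ll_model / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return {"mcfadden": mcfadden, "cox_snell": cox_snell,
            "nagelkerke": nagelkerke}

# Hypothetical outcomes and predicted probabilities, for illustration only.
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

scores = pseudo_r2(y, p)
print(scores)
```

Adding variables hierarchically, as the abstract describes, would amount to refitting the model at each step and tracking how these three values change.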
3
The article analyses the possibilities of approximating the inverse characteristics of constant-temperature anemometers (CTA) using various functions. Approximation results are compared for polynomial functions (3rd and 4th degree), exponential-type functions (exponential, Stirling and Gompertz) and a power function. Using approximation quality indicators (such as the coefficient of determination and the mean squared error), the individual methods were evaluated quantitatively on three different data sets from real hot-wire anemometer calibrations.
EN
The paper analyzes the possibilities of approximating the inverse static characteristics of constant-temperature hot-wire anemometers by means of different functions. Approximation results obtained with polynomials (3rd and 4th degree), exponential-type functions (exponential growth, Stirling and Gompertz functions) and a power function are presented and compared. Using coefficients that describe the quality of approximation (such as the coefficient of determination and the mean squared error), the individual methods were evaluated quantitatively on three different data sets originating from real hot-wire calibrations.
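A comparison of this kind can be sketched as follows, assuming NumPy and a small synthetic calibration set. The data points, the constants used to generate them and the helper name `r2_and_mse` are hypothetical and are not taken from the paper. Only the polynomial and power-function cases are shown; the power function is fitted by a straight line in log-log space, which is a simplification of a direct nonlinear fit.

```python
import numpy as np

def r2_and_mse(y_true, y_pred):
    """Coefficient of determination and mean squared error of a fit."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, ss_res / len(y_true)

# Hypothetical calibration points (velocity in m/s, bridge voltage in V),
# generated from an assumed smooth curve purely for illustration.
velocity = np.linspace(0.5, 20.0, 15)
voltage = np.sqrt(1.5 + 0.9 * velocity ** 0.45)

results = {}

# Inverse characteristic: velocity as a polynomial in voltage.
for degree in (3, 4):
    coeffs = np.polyfit(voltage, velocity, degree)
    results[f"poly{degree}"] = r2_and_mse(velocity, np.polyval(coeffs, voltage))

# Power function velocity = a * voltage**b, fitted as a line in log-log space.
b, log_a = np.polyfit(np.log(voltage), np.log(velocity), 1)
results["power"] = r2_and_mse(velocity, np.exp(log_a) * voltage ** b)

for name, (r2, mse) in results.items():
    print(f"{name}: R2 = {r2:.6f}, MSE = {mse:.6f}")
```

Reporting both indicators side by side helps because R2 is scale-free while the mean squared error stays in the squared units of the measured velocity.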
4
The analysis of electromyographic signals can be very time consuming. In designing a program for EMG signal analysis, there are two competing factors: the accuracy of the final result and the speed of the analysis. In scientific work, accuracy is the most important factor. All of the existing decomposition programs used in neurophysiology require a final phase of manual corrections if reliable results are to be obtained. This phase is considerably longer than the phase of automatic recognition. The solutions presented below, used in our new MUR program, allow for the accurate decomposition of complex EMG signals in a reasonable amount of time. The decomposition is performed interactively with optimal time division between automatic and manual tasks. All of this is achieved through a simple method of automatic recognition with the use of the modified coefficient of determination and the method of multiple subtractions of potentials.
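The recognition-and-subtraction idea can be sketched as follows, assuming NumPy. The abstract does not say how the coefficient of determination is modified, so this sketch uses the plain coefficient as the similarity score; the template, the synthetic signal and the function names are hypothetical and do not come from the MUR program.

```python
import numpy as np

def r2_match(segment, template):
    """Plain R2 of a template against a signal segment of equal length."""
    ss_res = np.sum((segment - template) ** 2)
    ss_tot = np.sum((segment - segment.mean()) ** 2)
    if ss_tot == 0.0:
        return -np.inf  # a flat segment cannot be matched meaningfully
    return 1.0 - ss_res / ss_tot

def best_match(signal, template):
    """Slide the template over the signal and return (position, score)."""
    width = len(template)
    scores = [r2_match(signal[i:i + width], template)
              for i in range(len(signal) - width + 1)]
    best = int(np.argmax(scores))
    return best, scores[best]

# Hypothetical potential shapes and a synthetic signal containing both.
template = np.array([0.0, 1.0, 3.0, -2.0, -1.0, 0.0])
other = np.array([0.0, 1.0, -0.5, 0.25, 0.0, 0.0])
signal = np.zeros(40)
signal[10:16] += template
signal[25:31] += other

# Recognize the template, then subtract it so that the remaining
# potentials can be searched for in the residual signal.
pos, score = best_match(signal, template)
residual = signal.copy()
residual[pos:pos + len(template)] -= template
print(pos, score)
```

Repeating the match-and-subtract step on `residual` with further templates is one way to picture multiple subtractions of potentials, including cases where potentials overlap.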
5
This paper presents novel feature extraction and classification methods for online handwritten Chinese character recognition (HCCR). The X-graph and Y-graph transformation is proposed for deriving a feature, which shows useful properties such as invariance to different writing styles. Central to the proposed method is the idea of capturing the geometrical and topological information from the trajectory of the handwritten character using the X-graph and the Y-graph. For feature size reduction, the Haar wavelet transformation was applied on the graphs. For classification, the coefficient of determination [...] from the two-dimensional unreplicated linear functional relationship model is proposed as a similarity measure. The proposed methods show strong discrimination power when handling problems related to size, position and slant variation, stroke shape deformation, close resemblance of characters, and non-normalization. The proposed recognition system is applied to a database with 3000 frequently used Chinese characters, yielding a high recognition rate of 97.4% with reduced processing time of 75.31%, 73.05%, 58.27% and 40.69% when compared with recognition systems using the city block distance with deviation (CBDD), the minimum distance (MD), the compound Mahalanobis function (CMF) and the modified quadratic discriminant function (MQDF), respectively. High precision rates were also achieved.