The aim of the research was to show that the methods of cluster analysis, already used in a number of scientific disciplines, are also applicable in tourism. Cluster analysis is a mathematical-statistical method that divides the objects of an input data matrix into several clusters. Distance measures are used to evaluate the similarity of the objects; Euclidean distance can be used for quantitative variables. Tourism potential has been studied by a number of authors here and abroad (see bibliography). The district (Czech: okres) was chosen as the basic spatial unit in our research. The advantage of this approach is that it allows use of the public database of the CZSO (Czech Statistical Office), which is the most important source of input data. Data published by the Institute for Spatial Development in Brno were also used. The tourism potential of individual districts was determined by the methods of cluster analysis, and the districts were divided into six groups. The agreement of these groups with reality confirms the appropriateness of the method used.
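The idea of dividing objects into clusters by Euclidean distance can be sketched in a few lines. This is only a minimal illustration, not the procedure used in the study: the abstract does not name the clustering algorithm, so average-linkage agglomerative clustering is assumed here, and the district names and indicator values are invented.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equally long numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering (assumed method).

    Starts with one cluster per object and repeatedly merges the two
    clusters with the smallest mean pairwise distance until only
    n_clusters remain. Returns a list of lists of object indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            total = sum(
                euclidean(points[a], points[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            dist = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical districts with two invented, already comparable indicators.
districts = {
    "District A": (1.0, 1.0),
    "District B": (1.2, 0.9),
    "District C": (5.0, 5.1),
    "District D": (5.2, 4.9),
    "District E": (9.0, 1.0),
    "District F": (9.1, 1.2),
}
names = list(districts)
clusters = agglomerate([districts[n] for n in names], n_clusters=3)
for group in clusters:
    print([names[i] for i in group])
```

With real indicators measured in different units, the variables would normally be standardized first, since Euclidean distance is sensitive to scale.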
The discovery of knowledge in the case of Hierarchical Cluster Analysis (HCA) depends on many factors, such as the clustering algorithms applied and the strategies developed in the initial stage of Cluster Analysis. We present a global approach for evaluating the quality of clustering results and making a comparison among different clustering algorithms using the relevant information available (e.g. the stability, isolation and homogeneity of the clusters). In addition, we present a visual method to facilitate evaluation of the quality of the partitions, allowing identification of the similarities and differences between partitions, as well as the behaviour of the elements in the partitions. We illustrate our approach using a complex and heterogeneous dataset (real horse data) taken from the literature. We apply HCA based on the generalized affinity coefficient (similarity coefficient) to the case of complex data (symbolic data), combined with 26 (classic and probabilistic) clustering algorithms. Finally, we discuss the obtained results and the contribution of this approach to gaining better knowledge of the structure of data.
This paper aims to investigate the patterns of gender inequalities in the Information and Communication Technology (ICT) sector in European Union (EU) countries. Based on secondary data from Eurostat, a cluster analysis has been conducted to identify clusters of EU countries with various patterns of dependencies among the gender pay gap, female entrepreneurship, and employment in the ICT sector. Three clusters of EU countries have been identified, each with a different pattern in the situation of women in this sector. In countries belonging to the first cluster, a higher gender pay gap coexists with the lowest share of female participation in the ICT sector and with a preference for entrepreneurship over employment. In countries of the second cluster, the lowest gender pay gap is observed together with higher female employment in the ICT sector than in the first cluster, and a higher share of employed women than of entrepreneurs. In the countries of the third cluster, a moderate gender pay gap is associated with the highest share of female ICT entrepreneurs, which is higher than the share of employed professionals. The discovery of these patterns of co-existence of the gender pay gap and women's participation in the ICT sector reveals that the pay gap is rather a factor preventing women from entering this sector, as there is limited potential to push them towards entrepreneurship instead of paid employment. The authors' results contribute to the theory of entrepreneurship and gender studies by investigating gender gaps in entrepreneurship and wages in the ICT sector as a primary sector.
The aim of this study was to find correlations between several studied elements and analyzed materials as well as the application and validation of an analytical method to determine trace elements in hair, fingernails and toenails of healthy volunteers (normal concentration). The method developed covers washing, mineralization and ICP-MS determination of 10 elements (Ca, Cd, Co, Cr, Cu, Fe, Mg, Ni, Pb and Zn) in hair and nails. Concentrations of the selected elements in hair, fingernails and toenails were measured for 24 women and 18 men. Furthermore, a chemometric approach (Principal Component Analysis, PCA) was employed to evaluate the correlations between concentrations of the elements in hair and nails and between these materials. Until now PCA has not been frequently applied in handling and interpretation of the results of analysis of biological materials. However, the results of the present investigation show the high potential of PCA in extraction of valuable information from analytical measurements. Additionally, PCA has become a useful tool for visualization of the obtained results. Moreover, the cluster analysis (CA) was used to group the samples according to gender, taking into account two different groups of elements: essential and toxic. [...]
The 4M method aims to determine the type of an equilibrated ordinary chondrite solely on the basis of the Mössbauer spectrum of the investigated meteorite. The Mössbauer spectrum of a non-weathered ordinary chondrite comprises four sub-spectra: two doublets and two sextets. One of the doublets is a signal from iron present in olivine and the other is a signal from iron present in pyroxene. The sextets, on the other hand, contain signals from magnetically ordered iron. One sextet is related to the metallic phase (kamacite, taenite), whereas the second is related to troilite. A third doublet, which emerges in weathered ordinary chondrites, is related to products of the oxidation of iron present in the metallic phase. The spectral areas of olivine, pyroxenes, metallic phase and troilite obtained from the Mössbauer spectrum are proportional to the number of iron atoms present in the relevant mineral phases. This fact inspired some Mössbauer groups to construct different methods for determining the type (H, L, or LL) of investigated meteorites (Gałązka-Friedman et al. 2019, Hyp. Inter. 241(1)). However, these methods, based on subjective criteria, were only qualitative. Our group elaborated a quantitative method based on objective criteria. We called it the "4M method" (where the four Ms stand for meteorites, Mössbauer spectroscopy, multidimensional discriminant analysis (MDA) and Mahalanobis distance) (Woźniak et al. 2019). This method uses only the Mössbauer experimental data, to which it applies advanced statistical methods. The base, which was created from Mössbauer data, consists of three clusters: H, L and LL. These clusters were constructed from sets of 4-dimensional vectors. 
The vectors comprise the spectral areas of the Mössbauer spectrum: ol (value proportional to iron present in olivine), pyr (value proportional to iron present in pyroxene), met (value proportional to iron present in the metallic phase) and tr (value proportional to iron present in troilite). To determine the type of an investigated ordinary chondrite, its ol, pyr, met and tr values need to be compared with the average values of these variables obtained for clusters H, L and LL. The comparison can be performed with the use of MDA and the Mahalanobis distance. Once the Mahalanobis distance of the investigated meteorite is known, the level of similarity to the three types of ordinary chondrites can be calculated. Examples of such calculations were performed for seven ordinary chondrites: Goronyo, Carancas, New Concord, NWA 7733, Leoncin, Sołtmany and Pułtusk. They were made with the use of the new base composed of 62 non-weathered ordinary chondrites. All results obtained with the 4M method were consistent with those of traditional mineralogical methods.
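The comparison of a sample's (ol, pyr, met, tr) vector with the H, L and LL clusters can be illustrated with a small Mahalanobis-distance sketch. Everything numeric below is an assumption: the centroids are invented, the cluster members are synthetic points generated around them, and a separate covariance matrix per cluster is assumed because the abstract does not say how the covariance is estimated. The formula converting distance into a level of similarity is not given in the abstract, so the sketch only reports distances and the nearest cluster.

```python
import math
import random


def mean_vector(rows):
    """Component-wise mean of a list of equally long vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows):
    """Sample covariance matrix (denominator n - 1)."""
    mu = mean_vector(rows)
    dim = len(mu)
    return [
        [sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (len(rows) - 1)
         for j in range(dim)]
        for i in range(dim)
    ]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mahalanobis(sample, rows):
    """Mahalanobis distance of sample from the cluster formed by rows."""
    mu = mean_vector(rows)
    diff = [s - m for s, m in zip(sample, mu)]
    # diff^T * cov^-1 * diff, computed without forming the inverse explicitly
    quad = sum(d * z for d, z in zip(diff, solve(covariance(rows), diff)))
    return math.sqrt(max(0.0, quad))


# Invented centroids (ol, pyr, met, tr); NOT measured meteorite data.
CENTROIDS = {
    "H": (35.0, 25.0, 22.0, 18.0),
    "L": (45.0, 25.0, 12.0, 18.0),
    "LL": (55.0, 25.0, 5.0, 15.0),
}
rng = random.Random(0)
base = {
    name: [[c + rng.gauss(0.0, 1.0) for c in centre] for _ in range(12)]
    for name, centre in CENTROIDS.items()
}


def classify(sample):
    """Return the nearest cluster label and all three distances."""
    distances = {name: mahalanobis(sample, rows) for name, rows in base.items()}
    return min(distances, key=distances.get), distances


label, distances = classify([36.0, 24.5, 21.0, 18.5])
print(label, {k: round(v, 2) for k, v in distances.items()})
```

Note that if the four spectral areas were normalized to sum to exactly 100 %, the covariance matrix would be singular and one variable would have to be dropped or the matrix regularized; the synthetic data here avoid that case.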
This paper describes the results of research that sought typical groups of students that emerge when teaching is delivered through e-learning. Cluster analysis was used, and it showed that there are five characteristic groups of students, which differ mainly in the way they communicate with the tutor.
The publication presented here is a comprehensive introduction to the most important fundamental principles, algorithms, and data, together with the structures to which these principles and algorithms apply. The topics presented serve as an introduction to considerations in the field of computer science. However, it is algorithms that form the basis of data analytics and the focal point of this textbook. Extracting knowledge from data requires methods and results from at least three fields: mathematics, statistics, and computer science. The book contains clear and intuitive mathematical and statistical explanations of the individual topics, which make the algorithms natural and transparent. The practice of data analysis, however, requires more than good scientific foundations, mathematical rigor, and the perspective of statistical methodology. The problems that generate data are enormously variable, and only the most basic algorithms can be matched to a problem as they stand. Programming fluency and experience with real problems are indispensable. The reader is guided through the algorithmic topics using Python and R, on the basis of real problems and analyses of the data those problems generate. A substantial part of the material in the book can also be absorbed by readers without knowledge of advanced methodology. As a result, the book can serve as a guide for a one- or two-semester course in data analytics for upper-level students of mathematics, statistics, and computer science. Because the required prior knowledge is not extensive, students who have completed a course in probability or statistics, know the basics of algebra and calculus, and have taken a programming course will have no difficulties; the text is also well suited to self-study by graduates of science and engineering programs. The core material is well illustrated with extensive examples drawn from real problems. 
The website associated with the book supports the reader with the data used in the book and with a presentation of selected parts of the lecture. I am convinced that the subject of the book is a new field of science.
The book under review gives a comprehensive presentation of data science algorithms; in its view, practical data analytics unites fundamental principles, algorithms, and data. Algorithms are the keystone of data analytics and the focal point of this textbook. Data science, as the authors claim, has been a discipline since 2001, although informally it existed before that date (cf. Cleveland (2001)). Graphical presentation of data played a crucial role, as a visualization of the knowledge hidden in the data. The discipline covers data mining as a tool or an important topic. The escalating demand for insights into big data requires a fundamentally new approach to architecture, tools, and practices. This is why the term data science is useful. It underscores the centrality of data in the investigation, because data are a store of potential value in the field of action. The label science invokes certain very real concepts, such as the notion of public knowledge and peer review. From this point of view, data science is not a new idea: it is part of a continuum of serious thinking that dates back hundreds of years. A good example of a result of data science is Benford's law (see Arno Berger and Theodore P. Hill (2015, 2017)). In an effort to identify some of the best-known algorithms widely used in the data mining community, the IEEE International Conference on Data Mining (ICDM) identified the top 10 algorithms in data mining for presentation at ICDM '06 in Hong Kong, where a panel was to announce them and discuss the impact of, and further research on, each of them. In the present book, clear and intuitive explanations of the mathematical and statistical foundations make the algorithms transparent. Most of the algorithms announced by IEEE in 2006 are included. But practical data analytics requires more than just the foundations. 
Problems and data are enormously variable, and only the most elementary algorithms can be used without modification. Programming fluency and experience with real and challenging data are indispensable, and so the reader is immersed in Python and R and in real data analysis. By the end of the book, the reader will have gained the ability to adapt algorithms to new problems and carry out innovative analysis.
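Benford's law, mentioned in the review as an example, states that the leading digit d of many naturally occurring numbers appears with probability log10(1 + 1/d). The short sketch below is not taken from the book; it compares that formula with the leading digits of the first 1000 powers of 2, a sequence chosen here purely as a convenient illustration.

```python
import math
from collections import Counter


def benford_probability(d):
    """Probability that the leading decimal digit equals d under Benford's law."""
    return math.log10(1 + 1 / d)


# Leading digits of 2**0 .. 2**999 (illustrative sequence, an assumption of this sketch).
leading = Counter(int(str(2 ** n)[0]) for n in range(1000))
observed = {d: leading[d] / 1000 for d in range(1, 10)}

for d in range(1, 10):
    print(d, round(benford_probability(d), 3), observed[d])
```

The observed frequencies track the logarithmic formula closely, with digit 1 leading roughly 30 % of the time instead of the 11 % a uniform distribution would give.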