Results found: 5

Search results
EN
The unemployment rate is considered one of the essential characteristics of the state of the economy. Unemployment duration can also describe the situation in the labour market. There are two sources of data on the duration of unemployment in the Czech Republic: data from the Labour Force Sample Survey provided by the Czech Statistical Office (aggregated or individual data) and aggregated data from the database of registered unemployed people held by labour offices under the Ministry of Labour and Social Affairs. A two-parameter lognormal distribution is used to model the distribution of durations quarterly from 1Q 2000 to 2Q 2019. The maximum likelihood estimates of the parameters are found from individual data, taking into account censored (incomplete) observations of unemployment duration; the minimum chi-squared method is used to estimate the parameters from aggregated data. Time series of the estimated parameters from different data sources, estimation procedures and data types are presented and compared. The relationship between the rate of unemployment and the duration of unemployment is shown.
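The likelihood that such a maximum likelihood fit would maximise can be sketched as follows. This is only an illustration, not the authors' code: it assumes right-censoring (the abstract says only "censored"), and the function name and argument layout are hypothetical.

```python
import math

def lognormal_censored_loglik(mu, sigma, durations, censored):
    """Log-likelihood of a two-parameter lognormal under (assumed) right-censoring.

    durations: positive observed times; censored[i] is True when the spell
    was still ongoing at observation time (an incomplete observation).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for t, is_censored in zip(durations, censored):
        z = (math.log(t) - mu) / sigma
        if is_censored:
            # Censored spell contributes the survival probability
            # S(t) = 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
            total += math.log(0.5 * math.erfc(z / math.sqrt(2)))
        else:
            # Completed spell contributes the lognormal log-density.
            total += (-math.log(t) - math.log(sigma)
                      - 0.5 * math.log(2 * math.pi) - 0.5 * z * z)
    return total
```

Passing this function to any numerical optimiser over `mu` and `sigma` would give the estimates. With no censored observations it reduces to the ordinary lognormal likelihood, whose maximum is at the mean and standard deviation of the log-durations.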
EN
Dirichlet's principle, also known as the pigeonhole principle, states that if n items are put into m containers, with n > m, then at least one container contains more than one item. In this work, we focus instead on an inverse Dirichlet's principle (obtained by switching items and containers), which reads as follows: if n items are put into m containers and n < m, then at least one container contains no item. Furthermore, we refine Dirichlet's principle using discrete combinatorics within a probabilistic framework. Treating the principle stochastically, we derive that the number of items n may even be greater than or equal to m while it remains very likely that at least one container is left without an item. The inverse formulation of the problem, rather than the original one, may have some practical applications, particularly given the derived effective upper-bound estimates for the number of items, as demonstrated in several applied mini-studies.
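The probabilistic version of the inverse principle can be illustrated with a short calculation. The sketch below assumes each item is placed independently and uniformly at random into one of the m containers; the abstract does not state this model, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_some_container_empty(n, m):
    """P(at least one of m containers is empty) after n random placements.

    Assumes independent, uniform placement of each item (an assumption
    made for this illustration only).
    """
    # Inclusion-exclusion over the containers forced to stay empty:
    # P(no container empty) = sum_k (-1)^k * C(m, k) * ((m - k) / m)^n
    p_all_occupied = sum(
        (-1) ** k * comb(m, k) * Fraction(m - k, m) ** n
        for k in range(m + 1)
    )
    return 1 - p_all_occupied
```

For n < m the function returns exactly 1, which is the deterministic inverse principle. For n = m = 10 it returns 1 - 10!/10^10, so an empty container is still almost certain although n is no longer smaller than m.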
EN
Comparing two time-event survival curves that represent the evolution of two groups of individuals over time is a common task in applied biostatistics. Although the log-rank test is the usually suggested tool for this problem, a rich statistical toolbox exists to overcome some of the limitations of the log-rank test. However, all of these methods are limited by relatively strict statistical assumptions. In this study, we introduce a new robust method for comparing two time-event survival curves. We briefly discuss selected issues of the robustness of the log-rank test and analyse in more detail some properties of the proposed method, mainly its asymptotic time complexity. The new method models individual time-event survival curves in a discrete combinatorial way as orthogonal monotonic paths, which enables direct estimation of the p-value as it was originally defined. We also briefly investigate how the area bounded by two survival curves plotted on a plane chart is related to the test’s p-value. Finally, using simulated time-event data, we check the robustness of the introduced method in comparison with the log-rank test. Based on the theoretical analysis and simulations, the introduced method seems to be a promising and valid alternative to the log-rank test, particularly when two time-event curves must be compared regardless of any statistical assumptions.
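The area bounded by two survival curves, mentioned above, can be computed directly once the curves are treated as step functions. The sketch below is only an illustration of that quantity, not the proposed test. It assumes fully observed event times (no censoring), so each curve is the plain empirical survival function, and both function names are hypothetical.

```python
def survival_at(event_times, t):
    """Empirical survival S(t): share of individuals with event time > t."""
    return sum(1 for x in event_times if x > t) / len(event_times)

def area_between_curves(times_a, times_b):
    """Area between two empirical survival step curves (no censoring assumed)."""
    # Both curves can change only at observed event times, so they are
    # constant on every interval between consecutive grid points.
    grid = sorted(set(times_a) | set(times_b) | {0.0})
    area = 0.0
    for left, right in zip(grid, grid[1:]):
        gap = abs(survival_at(times_a, left) - survival_at(times_b, left))
        area += gap * (right - left)
    return area
```

When one curve lies entirely above the other, this area equals the difference between the two groups' mean event times. Identical samples give an area of zero.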
EN
The log-rank test and Cox’s proportional hazards model can be used to compare survival curves but are limited by strict statistical assumptions. In this study, we introduce a novel, assumption-free method, based on a random forest algorithm, that can compare two or more survival curves. The proportion of the random forest’s trees with sufficient complexity is close to the test’s p-value estimate. Pruning the trees in the model changes their complexity and, thus, both the method’s robustness and its statistical power. The discussed results are confirmed by a simulation study that varies the survival curves and the tree pruning level.
EN
An exhaustive selection of all possible combinations of n = 400 from N = 698 observations of the COVID-19 dataset was used as a benchmark. Building a random set of subsamples and choosing the one that minimized an averaged sum of squares of each variable's category frequency returned results similar to those of a "forward" subselection, which reduces the dataset one observation at a time while steadily lowering the same metric. This works similarly to k-means clustering (with a random number of clusters) over the original dataset's observations, followed by choosing a subsample from each cluster proportionally to its size. However, the approaches differ significantly in asymptotic time complexity.
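The random-subsample strategy can be sketched as follows. The abstract does not define the metric precisely, so this sketch assumes one reading of it: for each categorical variable, sum the squared relative frequencies of its categories within the subsample, then average over variables. Function names, the tuple-per-observation layout and the `trials` and `seed` parameters are all hypothetical.

```python
import random
from collections import Counter

def balance_metric(rows):
    """Average over variables of the summed squared relative category frequencies.

    This is an assumed reading of the metric named in the abstract.
    """
    n_vars = len(rows[0])
    total = 0.0
    for j in range(n_vars):
        counts = Counter(row[j] for row in rows)
        total += sum((c / len(rows)) ** 2 for c in counts.values())
    return total / n_vars

def best_random_subsample(rows, n, trials, seed=0):
    """Draw `trials` random subsamples of size n and keep the lowest-metric one."""
    rng = random.Random(seed)
    candidates = (rng.sample(rows, n) for _ in range(trials))
    return min(candidates, key=balance_metric)
```

Each trial costs one pass over the subsample, so the total work grows linearly with the number of trials, whereas an exhaustive search has to visit every combination of n out of N observations.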