Search results
Searched for keyword: eksploracja danych (data mining)
Results found: 209
EN
The paper presents the application of machine learning techniques to estimating the temperature of asphalt layers during measurements with FWD (Falling Weight Deflectometer) and TSD (Traffic Speed Deflectometer) devices. Accurate determination of temperature is crucial for analysing the durability of road pavements. Traditional methods such as the BELLS3 model, although widely used, have limited forecast accuracy. The work presents the implementation of advanced algorithms, among them multivariate adaptive regression splines (MARS), support vector machines (SVM), artificial neural networks (ANN), random forests (RF) and boosted trees (BT), to optimise a model for estimating the asphalt-layer temperature Td. The BELLS3 model, used as the baseline in the optimisation process, was evaluated for prediction effectiveness. The results showed moderate effectiveness of this model (R2 = 82%, RMSE = 2.3°C), which indicated a need for further improvements. The use of machine learning techniques, particularly boosted trees (BT), made it possible to significantly improve the precision of predictions. The BT model achieved the best fit for the dependent variable Td (R2 = 99% and RMSE = 0.61°C), indicating its clear advantage over the other models, including the baseline BELLS3 model. Finally, the authors highlight the potential of integrating traditional approaches with advanced data analysis methods to further improve the accuracy of forecasting bituminous mixture layer temperatures and the effectiveness of road infrastructure management.
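The paper itself contains no code; the following is a minimal sketch of the kind of regression-model comparison it describes, using scikit-learn on synthetic stand-in data. The feature set and target are invented placeholders, not the authors' dataset, and the BELLS3 baseline is not reproduced.

# Hedged sketch: comparing SVM, ANN, RF and boosted-tree regressors for a
# temperature-like target. Synthetic data only; features are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform([0, 5, 0], [40, 50, 24], size=(500, 3))   # placeholder inputs
y = 0.6 * X[:, 0] + 0.4 * X[:, 1] + 2 * np.sin(X[:, 2] / 24 * 2 * np.pi) \
    + rng.normal(0, 1, 500)                               # placeholder target Td

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
models = {
    "SVM": SVR(),
    "ANN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "RF": RandomForestRegressor(random_state=0),
    "BT": GradientBoostingRegressor(random_state=0),      # boosted trees
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"{name}: R2={r2_score(y_te, pred):.3f}, RMSE={rmse:.2f}")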
EN
In this paper, we propose a new hybrid approach that combines the Generalized Normal Distribution Optimization Algorithm (GNDOA) with fuzzy C-means clustering (FCM). It is designed for processing unsupervised datasets and aims to improve on conventional feature selection and clustering techniques. The proposed GNDOA-FCM uses the generalized normal distribution concept along with FCM to obtain more accurate and efficient clustering, leading to faster detection in the survey region. The Calinski-Harabasz index helps find the number of clusters that gives high compactness within each cluster and good separation from the other clusters. The performance of the proposed hybrid GNDOA-FCM approach is tested extensively on different benchmark datasets. The results are compared with existing clustering methods using evaluation metrics such as silhouette score and feature selection accuracy. Experimental results show that the proposed method can be flexibly tuned to obtain higher-quality clustering and is more effective than conventional techniques.
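A sketch of the clustering side of this idea follows: a plain fuzzy c-means loop plus the Calinski-Harabasz index for choosing the number of clusters. The GNDOA metaheuristic from the paper is not reproduced; the data are synthetic blobs.

# Hedged sketch: plain FCM (not the paper's GNDOA-FCM hybrid) with the
# Calinski-Harabasz index scored over candidate cluster counts.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # random fuzzy memberships
    for _ in range(n_iter):
        W = U ** m
        centers = W.T @ X / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))         # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

X, _ = make_blobs(n_samples=400, centers=4, random_state=1)
for c in range(2, 7):
    _, U = fuzzy_c_means(X, c)
    labels = U.argmax(axis=1)                  # defuzzify for the CH index
    print(c, round(calinski_harabasz_score(X, labels), 1))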
EN
Today, traffic accidents remain a difficult and urgent problem for many countries around the world. Accidents on highways are often more serious than accidents on urban roads, so disseminating emergency information and establishing immediate contact with road users is key to rescuing passengers and reducing congestion. This study therefore applies data fusion and data mining techniques to analyse travel times and extract valuable information about traffic accidents from real-time data collected from On-Board Units installed in vehicles. The results show that this information constitutes a vital database for analysing traffic conditions and safety factors, and thus for developing a smart traffic information platform. It enables traffic managers to provide road users with real-time traffic information or forecasts of congestion and traffic accidents, which helps limit congestion and serious accidents on the highway.
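As a small illustration of the travel-time analysis described, the sketch below derives segment travel times from hypothetical On-Board Unit records and flags unusually slow traversals. The column names, data and threshold are all invented, not the study's schema.

# Illustrative only: hypothetical OBU log, one row per segment traversal.
import pandas as pd

obu = pd.DataFrame({
    "vehicle_id": [1, 2, 3, 4, 5],
    "segment": ["A"] * 5,
    "enter_time": pd.to_datetime(["2024-01-01 08:00", "2024-01-01 08:02",
                                  "2024-01-01 08:04", "2024-01-01 08:06",
                                  "2024-01-01 08:08"]),
    "exit_time": pd.to_datetime(["2024-01-01 08:06", "2024-01-01 08:08",
                                 "2024-01-01 08:11", "2024-01-01 08:30",
                                 "2024-01-01 08:14"]),
})
obu["travel_min"] = (obu["exit_time"] - obu["enter_time"]).dt.total_seconds() / 60
obu["seg_median"] = obu.groupby("segment")["travel_min"].transform("median")
# Flag traversals far above the segment median as possible incidents.
obu["incident_suspect"] = obu["travel_min"] > 2.5 * obu["seg_median"]
print(obu[["vehicle_id", "travel_min", "incident_suspect"]])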
A study of big data in cloud computing
EN
Over the last two decades, the size and amount of data has increased enormously, which has changed traditional methods of data management and introduced two new technological terms: big data and cloud computing. Addressing big data, characterized by massive volume, high velocity and variety, is quite challenging as it requires large computational infrastructure to store, process and analyze it. A reliable technique to carry out sophisticated and enormous data processing has emerged in the form of cloud computing because it eliminates the need to manage advanced hardware and software, and offers various services to users. Presently, big data and cloud computing are gaining significant interest among academia as well as in industrial research. In this review, we introduce various characteristics, applications and challenges of big data and cloud computing. We provide a brief overview of different platforms that are available to handle big data, including their critical analysis based on different parameters. We also discuss the correlation between big data and cloud computing. We focus on the life cycle of big data and its vital analysis applications in various fields and domains. At the end, we present the open research issues that still need to be addressed and give some pointers to future scholars in the fields of big data and cloud computing.
EN
This research establishes an energy consumption prediction model based on the Monte Carlo method to address the energy-saving renovation problem. First, the building is simplified to construct the proposed model. Second, using the principle of building energy balance and the Monte Carlo method, a cooling and heat demand model and an energy consumption prediction model for regional buildings are built. Finally, energy consumption simulation and prediction are carried out for the regional building complex after energy-saving renovation. The experiment shows that building energy consumption in July and August was relatively high, reaching 2.36E+14 and 2.4E+14, respectively, while consumption in April and November was relatively low, at 1.2E+14 and 1.4E+14, respectively. The highest prediction error occurred in November, reaching 12%; the lowest, in January and February, was only about 2%. The error of monthly energy consumption predicted by the Monte Carlo method is below 12%, the root-mean-square deviation is 5%, and the error between predicted and actual annual total energy consumption is only about 2%. Comparing the predicted energy consumption after the energy-saving renovation with that from before, the energy-saving rate reached about 20%. The results indicate that the proposed Monte Carlo based stochastic prediction model performs well in building energy-saving renovation, providing theoretical guidance and a reference for feasibility studies, planning, prediction, decision-making, and optimization of building energy-saving renovation.
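To make the Monte Carlo idea concrete, here is a minimal sketch that propagates uncertain building parameters into a distribution of annual heating demand. The balance equation is a deliberately simplified placeholder, not the paper's full regional model, and all parameter values are assumptions.

# Hedged Monte Carlo sketch: sample uncertain inputs, push them through a
# simple transmission-loss balance Q = U * A * dT * t, inspect the spread.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
area = 5_000.0                                  # m^2, assumed floor area
u_value = rng.normal(0.8, 0.1, n)               # W/(m^2 K), assumed U-value
delta_t = rng.normal(20.0, 2.0, n)              # K, assumed indoor-outdoor gap
hours = rng.normal(5_000, 300, n)               # assumed heating hours per year

q_kwh = u_value * area * delta_t * hours / 1000.0   # kWh per sampled scenario
print(f"mean = {q_kwh.mean():.3e} kWh, "
      f"5th-95th pct = {np.percentile(q_kwh, [5, 95])}")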
EN
Distresses are an integral part of pavements and occur throughout the life of a road. Bleeding is known as one of the most important problems of Iran's roads, especially in tropical areas and on transit routes with heavy axle loads, so identifying the factors contributing to the bleeding phenomenon is necessary and important. This study was therefore conducted to investigate the influence of mix-design parameters on the occurrence of the bleeding phenomenon and its severity. The collected data were analyzed and grouped using Design Expert and SPSS software. The results show that all five parameters (optimal bitumen percentage, bitumen percentage in the asphalt mixture, void percentage of the Marshall sample, void percentage, and filler-to-bitumen ratio) affect bleeding and its intensity. Among these, two parameters, the bitumen percentage in the asphalt mixture and the void percentage in the Marshall sample, have a greater effect on the severity of the bleeding phenomenon.
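As a hedged sketch of the kind of grouping analysis such studies run in SPSS, the snippet below performs a one-way ANOVA testing whether bitumen content differs across bleeding-severity groups. The data are fabricated placeholders, not the field measurements.

# Illustrative one-way ANOVA on invented bitumen-content samples for
# no / moderate / severe bleeding groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
none = rng.normal(4.8, 0.2, 30)        # % bitumen, sections without bleeding
moderate = rng.normal(5.2, 0.2, 30)    # % bitumen, moderate bleeding
severe = rng.normal(5.6, 0.2, 30)      # % bitumen, severe bleeding

f_stat, p_val = stats.f_oneway(none, moderate, severe)
print(f"F = {f_stat:.1f}, p = {p_val:.2g}")   # small p: content differs by group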
Eksploracja danych. Analiza dużych zbiorów danych [Data mining: analysis of large data sets]
PL
Data analysis is a continuously evolving process that consists of many stages. The article presents its main stage: data mining. How should one choose the right data mining method?
EN
Electricity theft is a problem for distribution system operators (DSOs) in Poland. DSOs use many methods to limit this unfavourable phenomenon. In this paper, the author presents a new method to detect the location of illegal power consumption. The method relies on processing data from an advanced metering infrastructure (AMI) and is based on the observation that some consumers illegally consume energy mainly in the winter season and that the level of illegal consumption may depend on the level of metered energy consumption. The method searches for periods in which the balance difference temporarily decreases while, at the same time, the energy consumption of one of the consumers drops.
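A rough sketch of the idea in this abstract follows: look for consumers whose metered consumption drops exactly when the feeder's balance difference swings. The monthly data, column names and the simulated theft pattern are all synthetic assumptions, not the paper's method or data.

# Illustrative only: correlate each consumer's month-over-month change with
# the (negated) change in balance difference; a thief should rank first.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
months = pd.period_range("2023-01", periods=12, freq="M")
consumers = pd.DataFrame(rng.normal(300, 20, (12, 4)), index=months,
                         columns=["c1", "c2", "c3", "c4"])
balance_diff = pd.Series(rng.normal(50, 5, 12), index=months)

# Simulate theft: c3 under-registers in winter, inflating the balance difference.
winter = [0, 1, 10, 11]
consumers.iloc[winter, 2] -= 80
balance_diff.iloc[winter] += 80

deltas = consumers.diff()
correlation = deltas.corrwith(-balance_diff.diff())
print(correlation.sort_values(ascending=False))   # c3 should rank first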
EN
Performing data mining tasks in the medical domain poses a significant challenge, mainly due to the uncertainty present in patients' data, such as incompleteness or missing values. In this paper, we focus on the data mining task of clustering corticosteroid (CS) responsiveness in sepsis patients. We address the challenge of missing data by applying Game-Theoretic Rough Sets (GTRS) as a three-way decision approach. Our study considers the APROCCHS cohort, comprising 1240 sepsis patients, provided by the Assistance Publique-Hôpitaux de Paris (AP-HP), France. Our experimental results on this cohort indicate that GTRS maintains the trade-off between accuracy and generality, demonstrating its effectiveness even as the number of missing values increases.
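For readers unfamiliar with three-way decisions, here is a minimal sketch of the decision step only: cases with an estimated responsiveness probability above alpha are accepted, below beta rejected, and the rest deferred to the boundary region. The game-theoretic tuning of (alpha, beta) that GTRS performs is not reproduced; the thresholds below are illustrative assumptions.

# Hedged sketch of a three-way decision rule with fixed, assumed thresholds.
def three_way(prob, alpha=0.7, beta=0.3):
    if prob >= alpha:
        return "responder"        # positive region
    if prob <= beta:
        return "non-responder"    # negative region
    return "defer"                # boundary region: gather more evidence

for p in (0.9, 0.5, 0.1):
    print(p, three_way(p))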
EN
Purpose: The aim of the article is to describe and forecast possible difficulties related to the development of cognitive technologies and the progressing algorithmization of HRM processes as part of Industry 4.0. Design/methodology/approach: While most studies to date on Industry 4.0 and Big Data concern the efficiency of cyber-physical systems and the improvement of algorithmic tools, this study proposes a different perspective. It attempts to foresee the difficulties connected with the algorithmization of HRM processes; understanding them could help to prepare for, or even eliminate, harmful effects on decisions made in managing organizations, especially regarding human resources management, in the era of Industry 4.0. Findings: Research on cognitive technologies in the broadest sense focuses primarily on their effectiveness, which can result in a one-sided view and ultimately a lack of objective assessment of that effectiveness. Conducting a parallel critical reflection therefore seems necessary. Such reflection can lead to a more balanced assessment of what speaks "for", but also of what may speak "against". The proposed point of view may contribute to a more informed use of algorithm-based cognitive technologies in the human resource management process, and thus improve their real-world effectiveness. Social implications: The article can serve an educational function, helps to develop critical thinking about cognitive technologies, and directs attention to the areas of knowledge by which future skills should be extended. Originality/value: This article is addressed to all those who use algorithms and data-driven decision-making processes in HRM. Crucial in these considerations is drawing attention to the dangers of unreflective use of technical solutions supporting HRM processes. The novelty of the proposed approach is the identification of three potential risk areas that may result in faulty HR decisions: the risk of "technological proof of equity", overconfidence in the supposedly objective character of algorithms, and the real danger resulting from so-called algorithm overfitting. Recognizing these difficulties can ultimately contribute to real improvements in productivity by combining human performance with technological effectiveness.
EN
The aluminum profile extrusion process is briefly characterized in the paper, together with the presentation of historical, automatically recorded data. The initial selection of the important, widely understood, process parameters was made using statistical methods such as correlation analysis for continuous and categorical (discrete) variables and ‘inverse’ ANOVA and Kruskal–Wallis methods. These selected process variables were used as inputs for MLP-type neural models with two main product defects as the numerical outputs with values 0 and 1. A multi-variant development program was applied for the neural networks and the best neural models were utilized for finding the characteristic influence of the process parameters on the product quality. The final result of the research is the basis of a recommendation system for the significant process parameters that uses a combination of information from previous cases and neural models.
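The pipeline described (statistical screening of process variables, then MLP models of defect occurrence) can be sketched as follows. This is a hedged illustration on synthetic data showing only the correlation-screening and MLP steps; the 'inverse' ANOVA and Kruskal-Wallis screening, the real process parameters and the multi-variant network development are not reproduced.

# Illustrative sketch: keep the variables most correlated with the defect
# flag, then fit an MLP classifier. All data are synthetic stand-ins.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 10))                      # 10 candidate process vars
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, 600) > 0).astype(int)

corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(10)])
keep = corr.argsort()[-3:]                          # top-3 variables
X_sel = X[:, keep]

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=0)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
print("selected:", keep, "test accuracy:", clf.score(X_te, y_te))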
EN
Approximately 30 million tons of tailings are stored each year at KGHM's Żelazny Most Tailings Storage Facility (TSF). Covering an area of almost 1.6 thousand hectares and surrounded by dams with a total length of 14 km and a height of over 70 m in some areas, it is the largest reservoir of post-flotation tailings in Europe and the second-largest in the world. With approximately 2900 monitoring instruments and measuring points surrounding the facility, Żelazny Most is subject to round-the-clock monitoring, which for safety and economic reasons is crucial not only for the immediate surroundings of the facility but for the entire region. The monitoring network can be divided into four main groups: (a) geotechnical, consisting mostly of inclinometers and VW pore pressure transducers, (b) hydrological, with piezometers and water level gauges, (c) geodetic, with laser and GPS measurements as well as surface and in-depth benchmarks, and (d) seismic, consisting primarily of accelerometer stations. Separately, a variety of chemical analyses are conducted, in parallel with spigotting processes and relief well monitoring. This leads to a large amount of data that is difficult to analyze with conventional methods. In this article, we discuss a machine learning driven approach that should improve the quality of monitoring and maintenance of such facilities. An overview of the main algorithms developed to determine stability parameters or to classify tailings is presented. Measurements from CPTU tests were used to analyze and classify the tailings; while the classification of natural soils based on CPT soundings is widely used, applying a similar method to tailings, here on the example of a post-flotation facility, is novel. Exploratory analysis identified the most significant parameters for the model, and selected machine learning models (k-nearest neighbours, SVM, RBF SVM, decision tree, random forest, neural networks, QDA) were compared to select the most effective one. The concepts described in this article will be further developed in the IlluMINEation project (H2020).
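The classifier comparison mentioned above can be sketched as below, using scikit-learn equivalents of the listed models on synthetic stand-in features; the actual CPTU measurements are not reproduced here.

# Hedged sketch: cross-validated comparison of the model families named in
# the abstract on synthetic multi-class data.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X, y = make_classification(n_samples=500, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)
models = {
    "kNN": KNeighborsClassifier(),
    "linear SVM": SVC(kernel="linear"),
    "RBF SVM": SVC(kernel="rbf"),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=2000, random_state=0),
    "QDA": QuadraticDiscriminantAnalysis(),
}
for name, model in models.items():
    print(f"{name}: {cross_val_score(model, X, y, cv=5).mean():.3f}")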
Using gradient boosting trees to predict the costs of forwarding contracts
EN
When selling goods abroad or bringing them into the country from foreign partners, we face the problem of delivery. The division of the related responsibilities between the manufacturer and the recipient varies, and in such situations it is reasonable to use the services of a forwarding company. A forwarding contract is then concluded which specifies the details of the service, but the most important issue remains setting its price. In this paper, we present results obtained using the LightGBM method in the forwarding contracts pricing challenge held as part of the FedCSIS 2022 conference.
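A minimal LightGBM regression sketch for a contract-pricing task of this kind is shown below. It assumes the lightgbm package is installed; the feature names and the price formula are invented placeholders, not the competition's variables.

# Hedged sketch: gradient-boosted trees pricing synthetic forwarding contracts.
import lightgbm as lgb
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 2_000
X = np.column_stack([
    rng.uniform(50, 2000, n),      # distance_km (assumed feature)
    rng.uniform(0.1, 24.0, n),     # cargo_tons (assumed feature)
    rng.integers(0, 2, n),         # is_international (assumed feature)
])
y = 1.2 * X[:, 0] + 40 * X[:, 1] + 300 * X[:, 2] + rng.normal(0, 50, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05)
model.fit(X_tr, y_tr)
print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))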
Rough sets turn 40: from information systems to intelligent systems
EN
The theory of rough sets was founded by Zdzisław Pawlak to serve as a framework for data and knowledge exploration. Following Professor Pawlak's seminal paper "Rough Sets", published in 1982 in the International Journal of Computer and Information Sciences, it is important to discuss the history, the present state and possible future developments of this theory, as well as its applications. One of the key aspects that lets us use rough sets in practical scenarios is the notion of an information system, which in fact comes from even earlier works of Professor Pawlak. Information systems are the means for data and knowledge representation. They constitute the input to rough set mechanisms aimed at computing concept approximations and deriving compact and interpretable decision models. Accordingly, in this paper we discuss where information systems come from. We claim that in many applications it is not enough to treat a data set, represented as an information system, as a purely mathematical object with no linkage to the data's origins. Quite the opposite: in practice we may need to work with information systems more actively, giving ourselves the technical possibility to construct them dynamically, taking into account interaction with the physical environments where the data is created.
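To ground the core notions, here is a toy illustration of Pawlak's lower and upper approximations computed from a tiny invented decision table: objects indiscernible on the chosen attributes form blocks, and a concept is approximated by those blocks.

# Hedged sketch on a made-up decision table: (attribute values, decision).
from collections import defaultdict

table = [
    (("high", "yes"), "flu"),
    (("high", "yes"), "flu"),
    (("high", "no"), "flu"),
    (("low", "no"), "healthy"),
    (("high", "no"), "healthy"),   # conflicts with object 2 -> boundary
]

blocks = defaultdict(set)                 # indiscernibility classes
for i, (attrs, _) in enumerate(table):
    blocks[attrs].add(i)

concept = {i for i, (_, d) in enumerate(table) if d == "flu"}
lower = set().union(*(b for b in blocks.values() if b <= concept))
upper = set().union(*(b for b in blocks.values() if b & concept))
print("lower:", sorted(lower), "upper:", sorted(upper),
      "boundary:", sorted(upper - lower))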
EN
We discuss the international competition FedCSIS 2022 Challenge: Predicting the Costs of Forwarding Contracts that was organized in association with the FedCSIS conference series at the KnowledgePit platform. We explain the scope and outline the results obtained by the most successful teams.
EN
In our previous work we presented a framework for mining spatio-temporal rules in the software development process. The rules are based on specific relations between structures of the source code which relate both to spatial (e.g. a direct call between methods of two classes) and temporal dependencies (e.g. one class introduced into the source code before the other) observed in the process. To some extent, spatio-temporal rules allow us to predict where and when certain design anti-patterns will appear in the source code of a software system. This paper presents how, with slight modifications, such framework can be used to improve the quality of detecting a few popular design anti-patterns, such as Blob, Swiss Army Knife, YoYo or Brain Class. In the proposed method, we not only check the structure of a piece of the source code, but we also analyse its spatio-temporal relations. Only on the basis of the two analyses can we decide if the given piece of code is an anti-pattern. Experimental validation shows that the addition of spatio-temporal perspective improves detection of anti-patterns by 4% in terms of F-measure.
EN
The paper considers the problem of determining a decision recommendation from examples of acceptable and unacceptable decisions indicated by the decision-maker. These examples are the basis for assessing the decision-maker's preferences. The essence of the presented solution is to represent the preferences as a cluster determined by supplementing the indicated examples. The paper proposes a procedure of successive approximations based on solving a classification task with the given examples.
EN
The article presents the disturbances encountered in the operation of a rotary sheeter and focuses on damage to electrical components such as encoders. Theoretical aspects of diagnostic systems based on artificial intelligence (neural networks) are also presented, and a simple diagnostic method based on statistics, applied in a corrugator installation, is discussed.
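A simple statistical diagnostic of the kind the article mentions can be sketched as follows: flag encoder readings that drift outside a 3-sigma band learned from healthy operation. The values and the pulse-interval framing are synthetic assumptions, not the article's implementation.

# Hedged sketch: 3-sigma alarm on synthetic encoder pulse intervals.
import numpy as np

rng = np.random.default_rng(1)
healthy = rng.normal(10.0, 0.05, 1000)      # ms between pulses, assumed healthy
mu, sigma = healthy.mean(), healthy.std()

new_readings = np.array([10.02, 9.97, 10.6, 10.01])   # 10.6 ms: suspicious
alarms = np.abs(new_readings - mu) > 3 * sigma
print(dict(zip(new_readings.tolist(), alarms.tolist())))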
EN
Purpose: Diabetes is a chronic disease that accounts for a large proportion of the nation's healthcare expenses, as people with diabetes require continuous medical care. Several complications occur if the polygenic disorder goes untreated and unrecognized. Diagnosis usually requires a visit to a diagnostic centre and consultation with a doctor, so one of the essential real-world tasks is to detect the disorder in its first phase. This work is essentially a survey analysing several parameters of polygenic disorder diagnosis; it shows that classification algorithms play an important role in automating the analysis of the disorder, alongside other machine learning methods. Design/methodology/approach: This paper provides an extensive survey of the different approaches that have been used for the analysis of medical data for the early detection of the polygenic disorder. It considers methods such as J48, CART, SVM and KNN, formally surveys the relevant studies, and provides a conclusion at the end. Findings: The survey, analysed across several parameters of polygenic disorder diagnosis, shows that classification algorithms play an important role in automating the analysis of the polygenic disorder, as do other machine learning algorithms. Practical implications: This paper will help future researchers in healthcare, specifically in the domain of diabetes, to understand the differences between classification algorithms. Originality/value: This paper will help in comparing machine learning algorithms by examining their results and selecting the appropriate approach based on requirements.
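The classifier families the survey covers can be compared with scikit-learn equivalents, as in the hedged sketch below (an entropy-based tree standing in for J48, a Gini tree for CART). Synthetic data are used; the Pima-style diabetes features are not reproduced.

# Illustrative comparison of the surveyed classifier families.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
models = {
    "tree (entropy, ~J48)": DecisionTreeClassifier(criterion="entropy",
                                                   random_state=0),
    "tree (gini, CART)": DecisionTreeClassifier(criterion="gini",
                                                random_state=0),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean().round(3))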
EN
Big data, artificial intelligence and the Internet of Things (IoT) remain very popular areas in current research and industrial applications. Processing massive amounts of data generated by the IoT and stored in distributed space is not a straightforward task and may cause many problems. During the last few decades, scientists have proposed many interesting approaches to extract information and discover knowledge from data collected in database systems or other sources. We observe a permanent development of machine learning algorithms that support each phase of the data mining process, ensuring better results than before. Rough set theory (RST) delivers a formal insight into information, knowledge, data reduction, uncertainty, and missing values. This formalism, formulated in the 1980s and developed by several researchers, can serve as a theoretical basis and practical background for dealing with ambiguities, data reduction, building ontologies, etc. Moreover, as a mature theory, it has evolved into numerous extensions and has been transformed through various incarnations, which have enriched the expressiveness and applicability of the related tools. The main aim of this article is to present an overview of selected applications of RST in big data analysis and processing. Thousands of publications on rough sets have appeared; therefore, we focus on papers published in the last few years. The applications of RST are considered from two main perspectives: direct use of RST concepts and tools, and use jointly with other approaches, i.e., fuzzy sets, probabilistic concepts, and deep learning. The latter, hybrid idea seems very promising for developing new methods and related tools, as well as for extending the application area.