This paper presents a novel approach to federated learning based on the Smooth Ordered Weighted Averaging (OWA) operator, which enables flexible and context-sensitive weighting of local models during the aggregation process. To enhance the precision of the aggregated weight computations, we incorporate numerical quadrature-inspired techniques, allowing for a more accurate representation of individual client contributions to the global model. Specifically, the approach utilizes classical OWA and several smoothed variants derived from Newton-Cotes quadratures, including the 3/8 rule, the trapezoidal rule, and the ONC4 (4-point open Newton-Cotes) formula. The study compares federated learning models using standard weight averaging against those incorporating both classical and smoothed OWA operators. This evaluation provides insight into how the smoothing mechanisms influence aggregation quality and final model accuracy. A neural network comprising several dense layers served as the classification model in the federated learning framework. Two experimental scenarios were considered: one where data was evenly distributed across local clients, and another with non-uniform data distribution to reflect real-world heterogeneity. Various strategies for extracting the OWA weights were explored, including performance-based weighting determined by the accuracy of local models during preliminary training rounds. The proposed methodology has been tested on small-scale image datasets such as MNIST, and it has demonstrated improved classification accuracy compared to traditional federated learning approaches using simple averaging.
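The classical OWA aggregation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that clients are ranked by a performance score (for example, local accuracy) and that each model is a flat list of parameters. The function name `owa_aggregate` and its arguments are hypothetical and not taken from the paper, and the smoothed, Newton-Cotes-derived weight vectors are not reproduced here.

```python
def owa_aggregate(client_weights, client_scores, owa_weights):
    """Aggregate flat client parameter vectors with a classical OWA operator.

    Clients are ranked by score (highest first). The i-th OWA weight is
    applied to the i-th ranked client, not to a fixed client.
    All names here are illustrative assumptions.
    """
    if not (len(client_weights) == len(client_scores) == len(owa_weights)):
        raise ValueError("inputs must have the same length")
    if abs(sum(owa_weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")

    # Indices of clients ordered from best to worst score.
    order = sorted(range(len(client_scores)),
                   key=lambda i: client_scores[i], reverse=True)

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for rank, idx in enumerate(order):
        for j in range(dim):
            aggregated[j] += owa_weights[rank] * client_weights[idx][j]
    return aggregated


# Example call: three clients, two parameters each.
global_params = owa_aggregate(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # client parameters
    [0.7, 0.9, 0.8],                        # local accuracies used for ranking
    [0.5, 0.3, 0.2],                        # OWA weights, by rank
)
```

With uniform OWA weights the function reduces to a plain unweighted average of the client models, which is why simple averaging is a natural baseline for this kind of comparison.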
Federated learning is a machine learning technique that enables models to learn while preserving user privacy. In this approach, multiple institutions collaborate to develop a shared model without exchanging raw data. Instead, they share only the model’s generated weights. In this article, a novel method for weight aggregation is proposed, based on weighted averages and entropy, within the framework of horizontal federated learning. The aggregation process begins by generating predictions on a validation set. Then, entropy is calculated for the weights from each client, reflecting the uncertainty or variability in their contributions. Finally, a weighted average is applied, and the previously computed entropies are used to determine the influence of each client’s weights in the final model. The proposed algorithm has been evaluated on several datasets and compared against widely used methods such as FedAvg, FedProx, and FedOpt. The results indicate that the new approach increased mean accuracy by about 2 percentage points compared to FedAvg. The most significant improvement was observed on the Iris dataset, where accuracy increased by about 6 percentage points.
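One possible reading of the entropy-driven aggregation described in this abstract is sketched below. The abstract does not give the exact formulas, so the sketch makes three assumptions: entropy is the mean Shannon entropy of each client model's class probabilities on the validation set, lower entropy earns more influence, and the mapping is an inverse-entropy normalisation. All function names are hypothetical.

```python
import math


def mean_prediction_entropy(prob_rows):
    """Mean Shannon entropy (natural log) over rows of class probabilities."""
    total = 0.0
    for row in prob_rows:
        total += -sum(p * math.log(p) for p in row if p > 0.0)
    return total / len(prob_rows)


def entropy_based_coefficients(entropies, eps=1e-12):
    """Assumed mapping: influence is proportional to 1 / entropy."""
    inverse = [1.0 / (h + eps) for h in entropies]
    norm = sum(inverse)
    return [v / norm for v in inverse]


def weighted_average(client_weights, coefficients):
    """Coefficient-weighted average of flat client parameter vectors."""
    dim = len(client_weights[0])
    return [
        sum(c * w[j] for c, w in zip(coefficients, client_weights))
        for j in range(dim)
    ]


# Example: client A is uncertain on the validation set, client B is confident.
entropies = [
    mean_prediction_entropy([[0.5, 0.5], [0.6, 0.4]]),
    mean_prediction_entropy([[0.9, 0.1], [0.95, 0.05]]),
]
coefficients = entropy_based_coefficients(entropies)
global_params = weighted_average([[0.0, 1.0], [1.0, 3.0]], coefficients)
```

Under these assumptions the confident client receives the larger coefficient. A different mapping, such as a softmax over negative entropies, would fit the same description equally well.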
In the age of digital transformation, cybersecurity has become a paramount concern for organizations worldwide. Traditional Intrusion Detection Systems (IDS) face limitations in adaptability and scalability, particularly when processing vast amounts of data from diverse sources. This article explores the application of Federated Learning (FL) to enhancing the performance of an IDS based on a convolutional neural network, presenting a novel approach that leverages decentralized data processing while preserving data privacy. By allowing multiple nodes to collaboratively learn a shared model without exchanging raw data, Federated Learning addresses the privacy and security concerns associated with centralized IDS. The study evaluates the effectiveness of the proposed FL-based IDS using the UNSW-NB15 dataset under both independent and identically distributed (IID) and non-IID data distributions, demonstrating improved detection accuracy and robustness against sophisticated cyber threats. The results underscore the potential of Federated Learning to revolutionize intrusion detection, offering a scalable, efficient, and privacy-preserving solution for modern cybersecurity challenges.
This study introduces an innovative interval-valued fuzzy inference system (IFIS) integrated with federated learning (FL) to enhance posture detection, with a particular emphasis on fall detection for the elderly. Our methodology significantly advances the accuracy of fall detection systems by addressing key challenges in existing technologies, such as false alarms and data privacy concerns. Through the implementation of FL, our model evolves collaboratively over time while maintaining the confidentiality of individual data, thereby safeguarding user privacy. The application of interval-valued fuzzy sets to manage uncertainty effectively captures the subtle variations in human behavior, leading to a reduction in false positives and an overall increase in system reliability. Furthermore, the rule-based system is thoroughly explained, highlighting its correlation with system performance and the management of data uncertainty, which is crucial in many medical contexts. This research offers a scalable, more accurate, and privacy-preserving solution that holds significant potential for widespread adoption in healthcare and assisted living settings. The impact of our system is substantial, promising to reduce the incidence of fall-related injuries among the elderly, thereby enhancing the standard of care and quality of life. Additionally, our findings pave the way for future advancements in the application of federated learning and fuzzy inference in various fields where privacy and precision in uncertain environments are of paramount importance.
Big data-driven intelligent fault diagnosis methods for devices rely on a large amount of labeled data for centralized training. However, in practical engineering, it is difficult for a single client to collect enough labeled sample data, which is one of the reasons that limit the application of these methods. In fact, multiple clients often use similar devices and collect fault data separately, so joint multi-client collaborative fault diagnosis modeling can solve the problem of data scarcity, but this poses great challenges to data privacy protection. In this paper, we propose a federated transfer fault diagnosis method based on federated learning for cross-domain incomplete data. The proposed method exchanges only the parameters of the locally trained models, which protects the privacy of each client’s local data. We construct a multi-client collaborative learning framework to address the weak generalization ability caused by incomplete training samples at a single client. We also propose a targeted semi-supervised fine-tuning strategy based on relative distance to reduce the probability of negative fine-tuning on out-of-distribution samples and improve the accuracy of diagnostic models. The results of cross-condition and cross-equipment experiments demonstrate that the proposed method has clear advantages over existing fault diagnosis methods.
The main objective of this work is to provide an analytical review of current intrusion detection systems based on machine learning (ML) algorithms. It also examines the relevant datasets and several techniques already used to develop an effective IDS with single, hybrid, and ensemble ML algorithms. The approaches in the literature are then investigated under several criteria to provide clear direction for future projects. Nowadays, companies of all kinds deploy an intrusion detection system (IDS), which counters cybercrime to protect the network, resources, and private data. Many strategies have been suggested and implemented so far to prevent malicious behaviour. Since ML approaches have proven successful, the proposed approach applies several ML models to intrusion detection. The CIC IoT 2023 dataset is used in this paper, and a two-step process for intrusion detection is proposed. It was tested with several techniques, including random forest, XGBoost, logistic regression, an MLP model, and an RNN. Following fine-tuning, the federated learning model using neural networks achieved the best accuracy, 99.84%.
Nowadays, two technological trends, Federated Learning (FL) and Edge Computing (EC), are increasingly important and influential. FL is a decentralized machine learning strategy that allows learning on distributed data. It primarily allows learning operations to be performed close to the user, where the data is gathered. This approach belongs to the EC domain, where the main goal is to move computation closer to the end user (e.g., away from the centralized cloud). In our work, we apply FL and EC in the context of network flow classification. We achieved an accuracy of 0.957 with the FL model, compared to 0.924 for the best local model. We achieved these results thanks to the federated averaging performed on neural network layers. To verify our approach, we executed all our experiments in a virtualized environment that emulates existing mid-scale EC network infrastructure, including limitations related to resource constraints on edge nodes.
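The layer-wise federated averaging mentioned in this abstract can be sketched as follows. The sketch assumes the standard FedAvg rule, in which each client's layer parameters are weighted by its share of the training samples. The abstract does not state the weighting used, and the function name `fedavg` and the flat-list layer layout are illustrative assumptions.

```python
def fedavg(client_layers, client_sizes):
    """Sample-size-weighted average of client models, layer by layer.

    client_layers: one entry per client; each entry is a list of layers,
                   and each layer is a flat list of parameters.
    client_sizes:  number of local training samples per client (assumed
                   to be the weighting criterion, as in standard FedAvg).
    """
    total = sum(client_sizes)
    num_layers = len(client_layers[0])
    global_layers = []
    for layer in range(num_layers):
        size = len(client_layers[0][layer])
        averaged = [0.0] * size
        for params, n in zip(client_layers, client_sizes):
            for j in range(size):
                averaged[j] += (n / total) * params[layer][j]
        global_layers.append(averaged)
    return global_layers


# Example: two edge nodes, a model with two layers.
node_a = [[1.0, 2.0], [0.0]]
node_b = [[3.0, 4.0], [4.0]]
global_model = fedavg([node_a, node_b], client_sizes=[1, 3])
```

In this example node B holds three quarters of the samples, so every averaged parameter lies three quarters of the way from node A's value to node B's.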
Spectrum occupancy detection is a key enabler for dynamic spectrum access, where machine learning algorithms are successfully utilized for detection improvement. However, the main challenge is limited access to labeled data about users’ transmission presence needed in supervised learning models. We present a distributed federated learning approach that addresses this challenge for sensors without access to learning data. The paper discusses the results of the conducted hardware experiment, where FL has been applied for DVB-T signal detection.
Over the last decade, automatic emotion recognition has become increasingly widespread in response to the growing need to improve the quality of human life. The emotion data used encompasses a wealth of personal information, including but not limited to gender, age, health condition, and identity. This demographic information, known as soft or hard biometrics, is private, and users may not wish to share it with others. Unfortunately, adversarial algorithms can infer this information automatically, creating the potential for a breach of users’ data. To address these issues, we present a federated learning-based approach that hides identity-related information with respect to the subject identification task while maintaining the data’s effectiveness for the emotion recognition (utility) task. We also introduce a differential privacy mechanism, a method that explicitly limits data leakage from the federated learning model. Experiments conducted on the WESAD dataset demonstrate that stress recognition can be carried out effectively while reducing the leakage of user identity and ensuring differential privacy guarantees; the amount of noise introduced by differential privacy can be tuned to balance the trade-off between privacy and utility.
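The tunable noise level mentioned in this abstract can be illustrated with a minimal sketch of one common way to add differential-privacy noise to a client update: clip the update to a fixed L2 norm, then add Gaussian noise scaled to that norm. The abstract does not say which mechanism the authors use, so this choice is an assumption. The names `clip_update`, `privatize_update`, `clip_norm` and `noise_multiplier` are likewise assumptions, and the sketch does not compute the resulting privacy budget.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [v * scale for v in update]


def privatize_update(update, clip_norm, noise_multiplier, rng):
    """Clip the update, then add Gaussian noise to every coordinate.

    The noise standard deviation is noise_multiplier * clip_norm, so a
    larger multiplier means more privacy and less utility.
    """
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]


# Example: the same update at two noise levels.
rng = random.Random(0)
update = [3.0, 4.0]
low_noise = privatize_update(update, clip_norm=1.0, noise_multiplier=0.1, rng=rng)
high_noise = privatize_update(update, clip_norm=1.0, noise_multiplier=2.0, rng=rng)
```

Raising `noise_multiplier` is the knob that trades recognition accuracy for stronger privacy in this kind of scheme.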
Federated learning (FL) involves joint model training by various devices while preserving the privacy of their data. However, it presents the challenge of dealing with heterogeneous data located on participating devices. This issue can be further complicated by the appearance of malicious clients aiming to sabotage the training process by poisoning local data. In this context, the problem arises of differentiating between poisoned data and non-independent and identically distributed (non-IID) data. To address it, a technique utilizing data-free synthetic data generation is proposed, using a reverse concept of adversarial attack. Adversarial inputs allow for improving the training process by measuring clients’ coherence and favoring trustworthy participants. Experimental results obtained from image classification tasks on the MNIST, EMNIST, and CIFAR-10 datasets are reported and analyzed.
As the computational and communication capabilities of edge and IoT devices grow, so do the opportunities for novel Machine Learning solutions. This leads to an increase in the popularity of Federated Learning (FL), especially in cross-device settings. However, while a multitude of ongoing research works analyze various aspects of the FL process, most of them do not focus on issues of operationalization and monitoring. For instance, there is a noticeable lack of research on effective problem diagnosis in FL systems. This work begins with a case study in which we set out to compare the performance of four selected approaches to the topology of FL systems. For this purpose, we constructed and executed simulations of their training process in a controlled environment. We analyzed the obtained results and encountered concerning periodic drops in accuracy for some of the scenarios. A reexamination of the experiments led us to diagnose the problem as caused by exploding gradients. In view of those findings, we formulated a potential new method for the continuous monitoring of the FL training process. The method would hinge on regular local computation of a handpicked metric: the gradient scale coefficient (GSC). We then extend our prior research to include a preliminary analysis of the effectiveness of GSC and average gradients per layer as potentially suitable FL diagnostic metrics. In order to examine their usefulness in different FL scenarios more thoroughly, we simulate the occurrence of the exploding gradient problem and the vanishing gradient problem, with stable gradients serving as a baseline. We then evaluate the resulting visualizations based on their clarity and computational requirements. Based on our results, we introduce a gradient monitoring suite for the FL training process.
Federated learning (FL) is an emerging concept widely used in distributed machine learning. It allows a large number of users to jointly learn a single machine learning model while the training data remains stored on individual user devices. In this way, federated learning lessens threats to data privacy. Based on iterative model averaging, our study suggests a feasible technique for the federated learning of deep networks with improved security and privacy. We also undertake a thorough empirical evaluation, taking various FL frameworks and averaging algorithms into consideration. Secure multi-party computation, secure aggregation, and differential privacy are implemented to improve security and privacy in a federated learning environment. In spite of these advancements, concerns over privacy remain in FL, as the weights or parameters of a trained model may reveal private information about the data used for training. Our work demonstrates that FL can be prone to label-flipping attacks, and a novel method to prevent such attacks is proposed. We compare the standard federated model aggregation and optimization methods, FedAvg and FedProx, using benchmark datasets. Experiments are implemented in two different FL frameworks, Flower and PySyft, and the results are analyzed. Our experiments confirm that classification accuracy increases in the FL framework over a centralized model and that model performance is better after adding all the security and privacy algorithms. Our work shows that deep learning models perform well in FL and are also secure.
Citizen science has emerged as a valuable resource for scientific research, providing large volumes of data for training deep learning models. However, the quality and accuracy of crowd-sourced data pose significant challenges for supervised learning tasks such as plant trait detection. This study investigates the application of AI techniques to address these issues within natural science. We explore the potential of multi-modal data analysis and ensemble methods to improve the accuracy of plant trait classification using citizen science data. Additionally, we examine the effectiveness of transfer learning from authoritative datasets like PlantVillage to enhance model performance on open-access platforms such as iNaturalist. By analysing the strengths and limitations of AI-driven approaches in this context, we aim to contribute to developing robust and reliable methods for utilising citizen science data in natural science.
Various methods for ensuring privacy in machine-learning-based data processing were compared. The most suitable methods were selected: homomorphic encryption, differential privacy, and federated learning. The effectiveness of the presented algorithms was quantified using commonly used metrics: the cost function for the quality of the learning process, accuracy for classification, and the coefficient of determination for regression.
This paper presents the research results and analysis of the impact of poisoning label-flipping attacks on federated learning for spectrum sensing. The experiments have been executed for random and coordinated attacks for varying attackers-to-genuine-users ratios, different levels of aggressiveness, and time duration of attacks. The results have been obtained by comparing spectrum sensing machine learning model performance with and without attacks.
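The random variant of a label-flipping attack at a single malicious client can be illustrated with a minimal sketch. It assumes binary spectrum-sensing labels (0 for a free channel, 1 for an occupied one) and treats the fraction of flipped labels as a stand-in for the aggressiveness of the attack. The function name `flip_labels` and its parameters are hypothetical, and the coordinated variant studied in the paper is not reproduced.

```python
import random


def flip_labels(labels, flip_fraction, rng, num_classes=2):
    """Return a copy of labels with a random fraction switched to another class.

    flip_fraction is an assumed proxy for attack aggressiveness:
    0.0 leaves the data clean, 1.0 flips every label.
    """
    poisoned = list(labels)
    n_flip = int(round(flip_fraction * len(labels)))
    for i in rng.sample(range(len(labels)), n_flip):
        # Pick any class other than the current one (for binary labels
        # this is simply the opposite class).
        choices = [c for c in range(num_classes) if c != poisoned[i]]
        poisoned[i] = rng.choice(choices)
    return poisoned


# Example: an attacker flips 30% of its local labels before training.
rng = random.Random(42)
clean = [0, 1] * 5
poisoned = flip_labels(clean, flip_fraction=0.3, rng=rng)
changed = sum(1 for a, b in zip(clean, poisoned) if a != b)
```

Comparing a model trained on `clean` labels with one trained on `poisoned` labels, for different attacker ratios and flip fractions, mirrors the with-attack versus without-attack comparison the abstract describes.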
While the emerging market of Generative Artificial Intelligence (AI) is increasingly dominated and controlled by the Tech Giants, there is also a growing interest in open-source AI code and models from smaller companies, research organisations and individual users. They often have valuable data that could be used for training, but their computing resources are limited, while data privacy concerns prevent them from sharing this data for public training. A possible solution to overcome these two issues is to utilise crowd-sourcing principles and apply federated learning techniques to build a distributed privacy-preserving architecture for training Generative AI. This paper discusses how these two key enablers, together with some other emerging technologies, can be effectively combined to build a community-driven Generative AI ecosystem, allowing even small actors to participate in the training of Generative AI models by securely contributing their training data. The paper also discusses related non-technical issues, such as the role of the community and intellectual property rights, and outlines further research directions associated with AI moderation.
Federated learning (FL) is a decentralized approach that aims at training a global model with the help of multiple devices, without collecting or revealing individual clients' data. The training of a federated model is conducted in communication rounds. Still, in certain scenarios, numerous communication rounds are impossible to perform. In such cases, a one-shot FL is utilized, where the number of communication rounds is limited to one. In this article, the idea of one-shot FL is enhanced with the usage of adversarial data, exploring and illustrating the possibilities to improve the performance of resulting global models, including scenarios with non-IID data, for image classification datasets: MNIST and CIFAR-10.
Federated learning (FL) allows multiple devices to jointly train a global model without sharing local data. One of its problems is dealing with unbalanced data. Hence, a novel technique, designed to deal with label-skewed non-IID data, using adversarial inputs is proposed. Application of the proposed algorithm results in faster, and more stable, global model performance at the beginning of the training. It also delivers better final accuracy and decreases the discrepancy between the performance of individual classes. Experimental results, obtained for MNIST, EMNIST, and CIFAR-10 datasets, are reported and analyzed.
In the field of logistics, there is a significant shortage of qualified employees. Artificial Intelligence (AI) can help solve that problem by supporting existing employees and reducing their workload. However, training AI models requires large amounts of data and, in most cases, due to a lack of trust between companies, model training is based solely on locally stored data from logistics providers and some publicly available datasets. To address this data scarcity issue, the proposed solution is to employ federated learning (FL) in the context of a data trust (DT), training AI models across multiple companies on both centralized data within the DT platform and decentralized data from logistics providers’ data silos, while ensuring data sharing access at the attribute level. This paper proposes this approach and points out the importance of data sharing for effective model training aimed at solving workforce challenges in logistics.
The proliferation of digital artifacts with various computing capabilities, along with the emergence of edge computing, offers new possibilities for the development of Machine Learning solutions. These new possibilities have led to the popularity of Federated Learning (FL). While there are many existing works focusing on various aspects of the FL process, the issue of the effective problem diagnosis in FL systems remains largely unexplored. In this work, we have set out to artificially simulate the training process of four selected approaches to FL topology and compare their resulting performance. After noticing concerning disturbances throughout their training process, we have successfully identified their source as the problem of exploding gradients. We have then made modifications to the model structure and analyzed the new results. Finally, we have proposed continuous monitoring of the FL training process through the local computation of a selected metric.