In recent years, with the rapid growth of information, artificial intelligence technology has been developed and applied in various fields. Among emerging hardware approaches, optical neural networks offer a new type of dedicated neural-network accelerator chip with the advantages of high speed, high bandwidth, and low power consumption. In this paper, we construct an optical neural network based on Mach–Zehnder interferometers. Experimental results on the classification of MNIST handwritten digits show that the optical neural network achieves high accuracy, fast convergence, and good scalability.
This paper presents an advanced quality assessment technique for thin Refill Friction Stir Spot Welded (RFSSW) joints based on machine learning and image segmentation. In particular, the research focuses on developing a predictive support vector machine (SVM) model whose purpose is to facilitate the selection of RFSSW process parameters so as to increase the shear load capacity of joints. In addition, an improved weld quality assessment algorithm based on optical analysis was developed. The research methodology includes specimen preparation, mechanical tests, and algorithmic analysis, culminating in a machine learning model trained on experimental data. The results demonstrate the effectiveness of the model in selecting welding process parameters and assessing weld quality, offering significant improvements over standard techniques. This research not only proposes a novel approach to optimizing welding parameters but also enables automatic quality assessment, potentially broadening the application of the RFSSW technique across industries.
Leaf diseases may harm plants in different ways, often causing reduced productivity and, at times, lethal consequences. Detecting such diseases in a timely manner can help plant owners take effective remedial measures. Deficiencies of vital elements such as nitrogen, microbial infections, and other similar disorders often have visible effects, such as the yellowing of leaves in Catharanthus roseus (bright eyes) and scorched leaves in Fragaria ×ananassa (strawberry) plants. In this work, we explore computer vision techniques that help plant owners identify such leaf disorders automatically and conveniently. The research designs three machine learning systems, namely a vanilla CNN model, a CNN-SVM hybrid model, and a MobileNetV2-based transfer learning model, that detect yellowed and scorched leaves in Catharanthus roseus and strawberry plants, respectively, using images captured by mobile phones. In our experiments, the models yield very promising accuracy on a dataset of around 4000 images. Of the three models, the transfer learning-based one demonstrates the highest accuracy (97.35% on the test set). Furthermore, an Android application is developed that uses this model to allow end-users to conveniently monitor the condition of their plants in real time.
Data augmentation is a popular approach to overcoming the insufficiency of training data in medical imaging. Classical augmentation is based on modifications (rotations, shears, brightness changes, etc.) of the images from the original dataset. Another possible approach is the use of Generative Adversarial Networks (GANs). This work is a continuation of previous research in which we trained Nvidia's StyleGAN2-ADA on a limited COVID-19 chest X-ray image dataset. In this paper, we study the dependence of GAN-based augmentation performance on dataset size, with a focus on small samples. Two datasets are considered, one with 1000 images per class (4000 images in total) and the second with 500 images per class (2000 images in total). We train StyleGAN2-ADA with both sets and then, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in a multi-class classification problem. We compare the GAN-based augmentation approach to two alternatives (classical augmentation and no augmentation at all) by employing transfer learning-based classification of COVID-19 chest X-ray images. The results are quantified using different classification quality metrics and compared to the results from the previous article and the literature. The GAN-based augmentation approach is found to be comparable with classical augmentation for medium and large datasets but underperforms on smaller datasets. The correlation between the size of the original dataset and the quality of classification is visible independently of the augmentation approach.
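As a point of reference for the classical side of this comparison, a minimal NumPy sketch of classical augmentation (random brightness shift plus horizontal flip) might look as follows; the function name and parameter values are illustrative, not taken from the paper:

```python
import numpy as np

def classical_augment(image, brightness_delta=0.1, flip=True, seed=None):
    """Produce a classically augmented copy of a grayscale image.

    image: 2D float array with values in [0, 1].
    brightness_delta: maximum absolute brightness shift (assumed value).
    """
    rng = np.random.default_rng(seed)
    out = image.astype(np.float64).copy()
    # Random brightness shift, clipped back to the valid range.
    out = np.clip(out + rng.uniform(-brightness_delta, brightness_delta), 0.0, 1.0)
    # Random horizontal flip with probability 0.5.
    if flip and rng.random() < 0.5:
        out = out[:, ::-1]
    return out
```

A real pipeline would typically combine several such transforms per image; this sketch only shows the basic shape of one.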
Melanoma is one of the most dangerous and life-threatening skin cancers. Exposure to ultraviolet rays may damage the skin cells' DNA, which can cause melanoma. However, detecting and classifying melanoma and nevus moles at their immature stages is difficult. In this work, an automatic deep-learning system based on intensity value estimation with a convolutional neural network (CNN) model has been developed for detecting and classifying melanoma and nevus moles more accurately. Since intensity levels are among the most distinctive features for identifying objects or regions of interest, high-intensity pixel values have been selected from the extracted lesion images. Incorporating these high-intensity features into the CNN improves the overall performance of the proposed model compared with state-of-the-art methods for detecting melanoma skin cancer. The system was evaluated with five-fold cross-validation. The experimental results showed superior accuracy (92.58%), sensitivity (93.76%), specificity (91.56%), and precision (90.68%).
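The intensity-selection step described above can be illustrated with a small NumPy sketch; the percentile threshold and function names are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def high_intensity_mask(lesion, percentile=90):
    """Select high-intensity pixels from a grayscale lesion image.

    Returns a boolean mask marking pixels at or above the given
    intensity percentile — the kind of distinctive pixels the
    abstract suggests feeding into the CNN.
    """
    threshold = np.percentile(lesion, percentile)
    return lesion >= threshold

def high_intensity_features(lesion, percentile=90):
    """Zero out everything below the threshold, keeping the image shape."""
    mask = high_intensity_mask(lesion, percentile)
    return np.where(mask, lesion, 0.0)
```

The masked image keeps its original shape, so it can be stacked with the raw image as an extra input channel to a CNN.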
Chronic obstructive pulmonary disease (COPD) is a complex and multi-component respiratory disease. Computed tomography (CT) images can characterize lesions in COPD patients, but the image intensity and morphology of lung components have not been fully exploited. Two datasets (Dataset 1 and 2) comprising a total of 561 subjects were obtained from two centers. A multiple instance learning (MIL) method is proposed for COPD identification. First, randomly selected slices (instances) from CT scans and multi-view 2D snapshots of the 3D airway tree and lung field extracted from CT images are acquired. Then, three attention-guided MIL models (slice-CT, snapshot-airway, and snapshot-lung-field models) are trained. In these models, a deep convolutional neural network (CNN) is utilized for feature extraction. Finally, the outputs of the above three MIL models are combined using logistic regression to produce the final prediction. For Dataset 1, the accuracy of the slice-CT MIL model with 20 instances was 88.1%. The VGG-16 backbone outperformed AlexNet, ResNet18, ResNet26, and MobileNetV2 in feature extraction. The snapshot-airway and snapshot-lung-field MIL models achieved accuracies of 89.4% and 90.0%, respectively. After the three models were combined, the accuracy reached 95.8%. The proposed model outperformed several state-of-the-art methods and afforded an accuracy of 83.1% on the external dataset (Dataset 2). The proposed weakly supervised MIL method is feasible for COPD identification. The effective CNN module and attention-guided MIL pooling module contribute to performance enhancement. The morphology information of the airway and lung field is beneficial for identifying COPD.
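The final fusion step — combining the three MIL model outputs with logistic regression — can be sketched as a minimal NumPy implementation trained by plain gradient descent; this is an illustrative stand-in, not the authors' code:

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Tiny logistic-regression fuser for the three MIL model outputs.

    X: (n_samples, 3) matrix of per-model COPD probabilities
       (slice-CT, snapshot-airway, snapshot-lung-field).
    y: (n_samples,) binary labels.
    Returns (weights, bias) learned by plain gradient descent.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid
        w -= lr * (X.T @ (p - y)) / n            # gradient step on weights
        b -= lr * np.mean(p - y)                 # gradient step on bias
    return w, b

def predict(X, w, b):
    """Threshold the fused probability at 0.5."""
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
```

In practice a library implementation (e.g., scikit-learn's `LogisticRegression`) would be used; the point here is only that the fuser has three inputs and one sigmoid output.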
Soil is the layer of solid particles that covers the surface of the earth. Soil can be classified by its color, because color indicates the nature and condition of the soil. CNNs work well for image classification but require large amounts of data. Augmentation is a technique that increases the amount of training data by applying various transformations to existing data. Rotation and gamma correction are simple augmentation techniques that can reproduce as many image variations as desired from an original image. A CNN architecture has convolution layers, and a Dense block has dense layers; adding Dense blocks to the CNN aims to overcome underfitting and overfitting problems. This study proposes a combination of augmentation and classification: in augmentation, rotation and gamma correction are combined to reproduce image data, and the CNN-Dense block is applied for classification. The soil images are grouped into five labels: black soil, cinder soil, laterite soil, peat soil, and yellow soil. The proposed method provides excellent results, with accuracy, precision, recall, and F1-score all above 90%. It can be concluded that the combination of rotation and gamma correction as augmentation techniques with CNN-Dense blocks is powerful for soil image classification.
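A minimal sketch of the rotation-plus-gamma-correction augmentation, assuming grayscale images with values in [0, 1]; the specific rotation angles and gamma values are illustrative choices, not the paper's:

```python
import numpy as np

def augment_rotation_gamma(image, k=1, gamma=0.8):
    """Rotation + gamma-correction augmentation for a soil image.

    image: 2D float array with values in [0, 1].
    k: number of 90-degree counter-clockwise rotations.
    gamma: power-law exponent; gamma < 1 brightens, gamma > 1 darkens.
    """
    rotated = np.rot90(image, k)
    return np.clip(rotated, 0.0, 1.0) ** gamma

def expand_dataset(images, ks=(1, 2, 3), gammas=(0.7, 1.0, 1.4)):
    """Reproduce each image under every rotation/gamma combination."""
    return [augment_rotation_gamma(img, k, g)
            for img in images for k in ks for g in gammas]
```

With the defaults above, each original image yields nine variants, which is the sense in which the technique "reproduces as many image variations as desired".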
The article deals with the topic of machine learning (ML) in the recognition of topographic objects in aerial and satellite VHR images, with particular emphasis on the Topographic Objects Database (BDOT10k). The aim of the research work was to test three supervised classification algorithms for automatic detection of selected classes of topographic objects, including buildings, concrete and asphalt elements of grey infrastructure (roads, pavements, squares), surface waters, forests, wooded and bushy areas, areas with low vegetation, and uncovered soil (unused land or excavations). Three commonly used classifiers were analysed for different input parameters: Maximum Likelihood, Support Vector Machine, and Random Trees. The result of the research is an assessment of their effectiveness in the detection of individual classes and of the suitability of the classification results for updating the BDOT10k database. The research was carried out on a WorldView-2 satellite image with a spatial resolution of 0.46 m and on orthophotos from aerial images with a spatial resolution of 0.08 m. The results indicate that the choice of ML classifier and source data only slightly affects the classification result. Overall accuracy statistics show that classification using satellite images gave slightly better results, by a few percentage points, in the range from 76% to 81%, compared with 75% to 78% for aerial photos. For some classes, the F1 measure exceeds 0.9. The tested ML algorithms give very good results in identifying selected topographic objects, but it is not yet possible to directly update the BDOT10k database from them.
Diabetes Mellitus (DM) belongs to the group of ten diseases with the highest mortality rate globally, with an estimated 578 million cases by 2030 according to the World Health Organization (WHO). The disease manifests itself through different disorders; vasculopathy shows a chronic relationship with diabetic ulceration events in distal extremities, with temperature serving as a biomarker that can quantify the risk scale. Accordingly, an analysis is performed with standing thermography images, finding temperature patterns that do not follow a particular distribution in patients with DM. Modern medical literature has therefore adopted Computer-Aided Diagnosis (CAD) systems as a plausible option to increase medical analysis capabilities. In this sense, we studied three state-of-the-art deep learning (DL) architectures, experimenting with convolutional, residual, and attention (Transformer) approaches to classify subjects with DM from diabetic foot thermography images. The models were trained under three data augmentation conditions. A novel method is proposed that modifies the images by changing the amplitude of their Fourier Transform, this being the first work to apply such a synergy to the characterization of ulcer risk through thermography. The results show that the proposed method reached the highest values, achieving perfect classification with the ResNet50V2 convolutional neural network, which is promising for limited datasets in thermal pattern classification problems.
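The Fourier-amplitude modification can be sketched in a few lines of NumPy; the multiplicative perturbation, its scale, and the function name are assumptions for illustration rather than the paper's exact recipe:

```python
import numpy as np

def fourier_amplitude_augment(image, scale=0.1, seed=None):
    """Augment an image by perturbing its Fourier amplitude.

    The 2D FFT is split into amplitude and phase; the amplitude is
    multiplied by random factors near 1 while the phase (which carries
    most structural information) is kept, and the image is then
    reconstructed with the inverse FFT.
    """
    rng = np.random.default_rng(seed)
    spectrum = np.fft.fft2(image)
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    amplitude *= 1.0 + rng.uniform(-scale, scale, size=amplitude.shape)
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase)))
```

Setting `scale=0` recovers the original image exactly, which makes the transform easy to sanity-check before use.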
The COVID-19 epidemic has been causing a global problem since December 2019. COVID-19 is highly contagious and spreads rapidly throughout the world, so early detection is essential. Chest imaging has been demonstrated to aid in tracking the progression of COVID-19 lung illness. The respiratory system is the component of the human body most vulnerable to the COVID virus. COVID can be diagnosed promptly and accurately using chest X-ray and computed tomography (CT) images. CT scans are preferred over X-rays to rule out other pulmonary illnesses, assist venous entry, and pinpoint any new heart problems. Traditional diagnostic tools are physical, time-inefficient, and less accurate. Many techniques for detecting COVID from CT scan images have recently been developed, yet none of them can efficiently detect COVID at an early stage. In this work, we propose a novel technique based on the two-dimensional Flexible Analytical Wavelet Transform (FAWT). The method decomposes pre-processed images into sub-bands, extracts relevant statistical features, and uses principal component analysis (PCA) to identify robust features. The robust features are then ranked with the help of Student's t-values. Finally, the features are fed to a Least Squares SVM (LS-SVM) with an RBF kernel for classification. According to the experimental outcomes, our model beat state-of-the-art approaches for COVID classification, attaining a classification accuracy of 93.47%, specificity of 93.34%, sensitivity of 93.6%, and F1-score of 0.93 using tenfold cross-validation.
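The t-value ranking step can be illustrated with a small NumPy sketch; the Welch (unequal-variance) form of the statistic is an assumption here, since the abstract does not specify the exact variant:

```python
import numpy as np

def t_values(X, y):
    """Per-feature Student's t statistic between two classes.

    X: (n_samples, n_features) feature matrix; y: binary labels (0/1).
    Uses the unequal-variance (Welch) form as a stand-in for the
    paper's t-value ranking step.
    """
    a, b = X[y == 0], X[y == 1]
    num = a.mean(axis=0) - b.mean(axis=0)
    den = np.sqrt(a.var(axis=0, ddof=1) / len(a) +
                  b.var(axis=0, ddof=1) / len(b))
    return num / den

def top_k_features(X, y, k):
    """Indices of the k features with the largest absolute t-value."""
    t = np.abs(t_values(X, y))
    return np.argsort(t)[::-1][:k]
```

Features with large |t| differ most between the COVID and non-COVID classes, which is why they are the ones kept for the classifier.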
Manual toll collection systems are obsolete due to time, fuel, and pollution issues and need to be replaced by better alternatives. Traditionally, governments have employed people to collect tolls, but manual labor is not very effective when it comes to monitoring and efficiency. We addressed this problem and developed an effective solution, the "Electronic Toll Collector Framework", designed mainly for collecting and monitoring the toll fees gathered by toll plazas in the vicinity of metropolitan cities such as Lahore or Karachi. The software can compute the toll tax based on vehicle type and can also generate daily, monthly, and yearly revenue reports. The framework can serve other purposes, such as vehicle monitoring by law enforcement agencies and the generation of analytics. It can also serve as a backbone for government departments that have a hard time monitoring the revenue generated by their employees. The framework has two operational modes (partly manual and automatic): the partly manual approach uses a TensorFlow backend, and the automatic approach uses a YOLOv2 backend. This work will be helpful in guiding future research and practical work in this domain.
Artificial neural networks (ANNs) are the most commonly used algorithms for image classification problems. An image classifier takes an image or video as input and assigns it to one of the categories it was trained to identify. Such classifiers are applied in various areas, including security, defense, healthcare, biology, forensics, and communication. There is often no need to create one's own ANN, because several pre-trained networks are already available. The aim of the SHREC project (automatic ship recognition and identification) is to classify and identify vessels based on images obtained from closed-circuit television (CCTV) cameras. For this purpose, a dataset of vessel images was collected during the 2018, 2019, and 2020 video measurement campaigns. The authors used three pre-trained neural networks, GoogLeNet, AlexNet, and SqueezeNet, to examine the feasibility of the classification and assess its quality. About 8000 vessel images were used, categorized into seven classes: barges, special-purpose service ships, motor yachts with motorboats, passenger ships, sailing yachts, kayaks, and others. A comparison of the results of using these neural networks to classify inland floating units is presented.
The aim of the research is to compare traditional and deep learning methods in image classification tasks. The experiment covers the analysis of five neural network models: two multi-layer perceptron architectures (an MLP with two hidden layers and an MLP with three hidden layers) and three convolutional architectures (a three-VGG-block model, AlexNet, and GoogLeNet). The models were tested on two different datasets, CIFAR-10 and MNIST, and applied to the task of image classification. They were evaluated for classification performance, training speed, and the effect of dataset complexity on the training outcome.
The Marina area represents an official new gateway of entry to Egypt, and infrastructure development is proceeding rapidly in this region. The objective of this research is to obtain building data by automated extraction from Pléiades satellite images, motivated by the need for efficient mapping and updating of geodatabases for urban planning and touristic development. The study compares the performance of the random forest algorithm with other classifiers, namely maximum likelihood, support vector machines, and backpropagation neural networks, on the well-organized buildings that appear in the satellite images. Images were classified into two classes: buildings and non-buildings. In addition, basic morphological operations such as opening and closing were used to enhance the smoothness and connectedness of the classified imagery. The overall accuracies for random forest, maximum likelihood, support vector machines, and backpropagation were 97%, 95%, 93%, and 92%, respectively. Random forest proved to be the best option, followed by maximum likelihood, while the backpropagation neural network was the least effective. The completeness and correctness of the detected buildings were also evaluated. Experiments confirmed that the four classification methods can effectively detect 100% of the buildings in very high-resolution images. The use of machine learning algorithms for object detection and extraction from very high-resolution images is therefore encouraged.
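The morphological post-processing mentioned above (opening and closing of a binary building mask) can be sketched in plain NumPy with a 3×3 structuring element; in practice a library routine such as those in SciPy or OpenCV would be used:

```python
import numpy as np

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    p = np.pad(mask, 1, constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy : 1 + dy + mask.shape[0],
                     1 + dx : 1 + dx + mask.shape[1]]
    return out

def erode(mask):
    """Binary erosion with a 3x3 square structuring element."""
    p = np.pad(mask, 1, constant_values=False)
    out = np.ones_like(mask, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= p[1 + dy : 1 + dy + mask.shape[0],
                     1 + dx : 1 + dx + mask.shape[1]]
    return out

def opening(mask):
    """Opening (erode then dilate) removes small isolated false positives."""
    return dilate(erode(mask))

def closing(mask):
    """Closing (dilate then erode) fills small holes inside buildings."""
    return erode(dilate(mask))
```

Opening cleans up stray "building" pixels scattered by the classifier, while closing reconnects building regions broken by small gaps — the smoothness and connectedness improvements the abstract refers to.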
This article presents an analysis of the possibilities of using the pre-trained GoogLeNet artificial neural network to classify inland vessels. Inland water authorities monitor vessel traffic intensity via CCTV, and automatic classification could support their statutory tasks. The automatic classification of inland vessels from video recordings is one of the main objectives of the Automatic Ship Recognition and Identification (SHREC) project. The image repository for training purposes consists of about 6,000 images of different categories of vessels. Some images were gathered from internet websites, and some were collected by the project's video cameras. The GoogLeNet network was trained and tested in 11 variants, which involved modifications of the image sets (e.g., changes in the number of classes, changes of class types, initial preprocessing of images, and removal of images of insufficient quality). The final classification quality was 83.6%. The newly obtained neural network can be an extension and a component of a comprehensive geoinformatics system for vessel recognition.
Deep learning is a subcategory of machine learning that involves creating multilayer neural networks, mimicking the way the human brain performs tasks. Deep learning algorithms are arranged in order of increasing complexity, which makes it possible to build systems for analyzing large data sets. The learning process takes place without supervision, and the program builds the set of features to recognize on its own. The article explains how this kind of classification is applied to tomographic images.
The article considers the main criteria for selecting and composing a wardrobe, which is one of the application areas of image classification methods and tools. Typical software solutions for this task are analyzed, with the Analytic Hierarchy Process used to compare such applications. To improve the wardrobe selection process, the concept of an intelligent information system based on convolutional neural networks is proposed.
A common problem when training a classifier is a small number of samples in the training database, which can significantly affect the results. To increase the number of samples, data augmentation can be used: it generates new samples based on existing ones, most often using simple transformations. In this paper, we propose a new approach that generates such samples using image processing techniques and a discrete interpolation method. The described technique creates a new image sample using at least two others from the same class. To verify the proposed approach, we performed tests using different convolutional neural network architectures on the ship classification problem.
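The core idea — creating a new sample from at least two images of the same class by interpolation — can be sketched as a simple convex combination; this is an illustrative simplification, not the paper's exact discrete interpolation method:

```python
import numpy as np

def interpolate_samples(img_a, img_b, alpha=0.5):
    """Create a new training sample from two same-class images.

    new = alpha * A + (1 - alpha) * B; both images must share the
    same shape and value range.
    """
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    return alpha * img_a + (1.0 - alpha) * img_b

def augment_class(images, alphas=(0.25, 0.5, 0.75)):
    """Generate interpolated samples from every image pair in one class."""
    new = []
    for i in range(len(images)):
        for j in range(i + 1, len(images)):
            for a in alphas:
                new.append(interpolate_samples(images[i], images[j], a))
    return new
```

Because the blend stays within one class, the label of the new sample is unambiguous — unlike mixup-style schemes that blend across classes.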
Segmentation is a key computer vision task in modern medical applications. Instance segmentation has become the prevalent way to improve segmentation performance in recent years. This work proposes a novel way to design an instance segmentation model that combines three semantic segmentation models dedicated to foreground, boundary, and centroid predictions. It contains no detector, so it is orthogonal to the standard instance segmentation design and can be used to improve the performance of a standard design. The custom-designed model is verified on the Gland Segmentation in Colon Histology Images dataset.
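One simple way such branch outputs could be fused — subtracting the boundary prediction from the foreground and labelling the connected components — can be sketched as follows; the centroid branch is omitted, and this is an assumed fusion scheme, not the paper's exact design:

```python
import numpy as np
from collections import deque

def label_instances(foreground, boundary):
    """Split a semantic foreground mask into instances.

    Subtracting the predicted boundary from the foreground separates
    touching objects; 4-connected components of the remainder are then
    given distinct integer labels (0 = background/boundary).
    """
    seeds = foreground & ~boundary
    labels = np.zeros(seeds.shape, dtype=int)
    current = 0
    for sy, sx in zip(*np.nonzero(seeds)):
        if labels[sy, sx]:
            continue                      # already part of an instance
        current += 1
        queue = deque([(sy, sx)])
        labels[sy, sx] = current
        while queue:                      # breadth-first flood fill
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < seeds.shape[0] and 0 <= nx < seeds.shape[1]
                        and seeds[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    queue.append((ny, nx))
    return labels
```

A production pipeline would use an optimized labelling routine (e.g., `scipy.ndimage.label`) and could grow the labelled regions back over the boundary pixels to recover full instance masks.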
This work proposes a fog computing-based system for face mask detection that controls the entry of a person into a facility. The proposed system uses fog nodes to process the video streams captured at the facility's entrances. Haar cascade classifiers are used to detect face regions in the video frames. Each fog node deploys two MobileNet models: the first handles the mask/no-mask dichotomy, and the second handles the proper/improper mask-wear dichotomy and is applied only if the first model detects a mask in the facial image. This two-level classification allows people to enter the facility only if they wear a mask properly. The proposed system offers performance benefits such as improved response time and reduced bandwidth consumption, as the video stream is processed locally at each fog gateway without relying on the Internet.
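The two-level decision logic can be sketched in a few lines; `mask_model` and `wear_model` stand in for the two MobileNet classifiers and are hypothetical callables, as are the label strings:

```python
def entry_decision(face_image, mask_model, wear_model):
    """Two-level mask check controlling facility entry.

    mask_model and wear_model are hypothetical callables returning
    'mask'/'no_mask' and 'proper'/'improper' respectively; entry is
    allowed only when a mask is detected AND worn properly.
    """
    if mask_model(face_image) != "mask":
        return "deny: no mask"
    # The second model runs only when the first one detected a mask.
    if wear_model(face_image) != "proper":
        return "deny: improper wear"
    return "allow"
```

Running the second model conditionally keeps the per-frame cost low on the fog node, since most faces fail or pass at the first level.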