
Results found: 142

Search results
Searched:
in keywords: computer vision
PL
Dokonując przeglądu stanu wiedzy nt. modelowania informacji o budynku – BIM (ang. Building Information Modelling) można zauważyć, że technologia BIM nie poczyniła ostatnio znacznych postępów, ponieważ sztuczna inteligencja – AI (ang. Artificial Intelligence) nie jest jeszcze w pełni wykorzystana. Celem niniejszego artykułu jest zaprezentowanie możliwości wykorzystania sztucznej inteligencji – AI w modelowaniu BIM. Autorzy dokonali analizy trendów rozwoju sztucznej inteligencji, która jest obecnie wykorzystywana w modelowaniu BIM. W artykule przedstawiono również możliwości wykorzystania AI powiązanej z modelem BIM, a także omówiono wybrane przykłady wspomagania modelowania informacji o budynku z wykorzystaniem głównych czterech grup wybranych technik AI.
EN
A review of the state of knowledge on building information modeling (BIM) suggests that BIM technology has not made significant progress recently because artificial intelligence (AI) has not yet been fully exploited. The purpose of this article is to present the possibilities of using AI in BIM modeling. The authors analyzed trends in the development of AI as it is currently used in BIM modeling. The article also presents ways of linking AI with the BIM model and discusses selected examples of AI-supported building information modeling drawn from four main groups of AI techniques.
EN
We propose a model based on computer vision and machine learning that secures ATMs against fraudulent activity. It uses a Haar cascade (HRC) classifier for face detection and a Local Binary Pattern Histogram (LBPH) classifier for face recognition. The system combines PIN and face recognition to identify and authenticate the user against the trained dataset, and it triggers a real-time alert email if the user turns out to be unauthorized. An unauthorized user is also not allowed to log in to the machine, which addresses the ATM security issue. The system was evaluated on a dataset of real-world ATM camera feeds and achieved an accuracy of 90%. It can effectively detect many types of fraud, including identity theft and unauthorized access, which makes it more reliable.
EN
In this paper, we have researched implementing convolutional neural network (CNN) models for devices with limited resources, such as smartphones and embedded computers. To optimize the number of parameters of these models, we studied various popular methods that would allow them to operate more efficiently. Specifically, our research focused on the ResNet-101 and VGG-19 architectures, which we modified using techniques specific to model optimization. We aimed to determine which approach would work best for particular requirements for a maximum accepted accuracy drop. Our contribution lies in the comprehensive ablation study, which presents the impact of different approaches on the final results, specifically in terms of reducing model parameters, FLOPS, and the potential decline in accuracy. We explored the feasibility of implementing architecture compression methods that can influence the model’s structure. Additionally, we delved into post-training methods, such as pruning and quantization, at various model sparsity levels. This study builds upon our prior research to provide a more comprehensive understanding of the subject matter at hand.
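To make the post-training methods named in this abstract concrete, here is a minimal sketch of magnitude pruning to a chosen sparsity level followed by symmetric 8-bit quantization, applied to a flat list of weights. It is not the authors' pipeline: the function names, the 50% sparsity, and the toy weights are assumptions chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [int(round(w / scale)) for w in weights], scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.8, -0.05, 0.3, -1.27, 0.02, 0.6]  # toy values (assumed)
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
```

Pruning reduces the number of non-zero parameters, while quantization reduces the storage per parameter; the error introduced by `dequantize` is one simple proxy for the accuracy cost that such a study has to weigh.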
EN
Ultrasound imaging is common in surgical training and in the development of medical robotic systems. Recent approaches to surgical training often use gelatin-based soft-tissue phantoms, with additional objects inserted to represent different, typically fluid-based, pathologies. Segmenting these objects from the images is an important step in the development of training and robotic systems. The current study proposes a simple and fast algorithm for segmenting convex cyst-like structures from phantoms when very few training samples are available. The algorithm is based on a custom two-step thresholding procedure followed by post-processing with two trainable parameters. Two large phantoms with convex cysts were created and used to train the algorithm and evaluate its performance. The train/test procedure was repeated 60 times with different dataset splits and proved the viability of the solution with only 4 training images. The Dice coefficients averaged 0.92 and exceeded 0.95 in the best cases, all with fast performance in single-thread operation. The algorithm might be useful for the development of surgical training systems and medical robotic systems in general.
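The Dice coefficient reported in this abstract measures the overlap between a predicted mask and a reference mask. The sketch below computes it for binary masks stored as nested lists; the function name and the toy masks are assumptions for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    if pred_sum + target_sum == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy masks (assumed): the prediction misses one pixel of the reference.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 1, 1],
          [0, 0, 0, 0]]
score = dice_coefficient(pred, target)  # 2 * 4 / (4 + 5)
```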
EN
To assess the causes of failure of parts in operation, it is often necessary to assess the degradation of the structural and phase composition of the material and to determine the cause of its change. Microhardness testing is used to evaluate the mechanical properties of microvolumes of the material. The microhardness of structural components of steels and cast irons (armco iron ferrite, the austenitic component of steel 12Х18Н10Т, and cementite of centrifugally cast chrome-nickel cast iron (cast coating Ø910 mm)) was determined from the reconstructed impression of a four-sided pyramid with a square base and a top angle of 136±1°. The paper evaluates the influence of the main factors on the microhardness error for the ferritic, austenitic and carbide components of steels and cast irons: the magnitude and rate of the indenter load, the stiffness of the substrate, the field of distribution of plastic deformations around the impression, the quality of the surface preparation, the influence of grain boundaries, and the relaxation of the impression shape over time. The main factors affecting the accuracy of measurements by the reconstructed impression method have been determined for each of the investigated phases: ferrite, austenite, and cementite.
PL
Aby ocenić przyczyny awarii części w eksploatacji, często konieczna jest ocena degradacji składu strukturalnego i fazowego materiału oraz określenie przyczyny jego zmiany. Do oceny właściwości mechanicznych mikroobjętości materiału stosuje się test mikrotwardości. Mikrotwardość składników strukturalnych stali i żeliwa (ferryt żelaza armco, austenityczny składnik stali 12Х18Н10Т i cementyt odśrodkowo odlewanego żeliwa chromowo-niklowego (powłoka odlewu Ø910 mm)) określono przez przywrócony wycisk piramidy czterobocznej o podstawie kwadratowej i kącie wierzchołkowym 136±1°. W pracy oceniono wpływ głównych czynników na błąd mikrotwardości ferrytycznego, austenitycznego i węglikowego składnika stali i żeliwa: wielkości i prędkości obciążenia wgłębnika, sztywności podłoża, pola rozkładu odkształceń plastycznych wokół wycisku, jakości przygotowania powierzchni, wpływu granic ziaren oraz relaksacji kształtu wycisku w czasie. Określono główne czynniki wpływające na dokładność pomiarów metodą zrekonstruowanego wycisku dla każdej z badanych faz: ferrytu, austenitu i cementytu.
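For reference, a square-based pyramid with a 136° apex angle is the standard Vickers indenter geometry, and hardness is computed from the measured impression with the usual relation below. The abstract does not state the formula, so the symbols here are assumptions for illustration: F is the applied load in kgf and d is the mean diagonal of the impression in mm.

```latex
\mathrm{HV} = \frac{2F\sin\left(136^\circ/2\right)}{d^{2}} \approx 1.8544\,\frac{F}{d^{2}}
```

Because d enters squared, a small relative error in the measured diagonal produces roughly twice that relative error in the computed hardness, which is one reason the impression-related factors listed in the abstract matter.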
EN
Oil is used for lubrication and cooling in every standard jet engine. Hydraulic installations are therefore among the main parts of most component test rigs, and in some cases they can be large and complicated. Removing the sources of leaks is a significant task for engineers and technicians. Oil leaks generate costs, reduce the reliability of tests, and are difficult to detect with classic sensors. This paper describes the implementation of computer vision methods in an aviation component test laboratory. Three algorithms were proposed and successfully tested.
PL
Olej jest wykorzystywany do smarowania i chłodzenia w każdym silniku odrzutowym. Z tego względu instalacje olejowe są jednymi z głównych części stanowisk badawczych, a usuwanie przyczyn wycieków jest znaczącym zadaniem inżynierów i techników. Wycieki oleju generują koszty, ograniczają wiarygodność testów i są trudne do wykrycia przy pomocy klasycznych czujników pomiarowych. Dokument opisuje implementację metod widzenia maszynowego w lotniczych laboratoriach badawczych. W ramach prac zostały zaproponowane i przetestowane trzy algorytmy.
EN
Image quality assessment is a crucial task in various fields such as digital photography, online content creation, and automated quality control, as it ensures an optimal visual experience and aids in maintaining consistent standards. In this paper, we propose an efficient method for training image quality assessment models on the KonIQ-10k dataset. Our approach uses a dual-Xception architecture that analyzes both the image content and additional image parameters, outperforming traditional single convolutional models. We introduce cross-sampling methods with random draw sampling of instances from majority classes, which effectively enhances prediction quality in the Mean Opinion Score (MOS) ranges that are underrepresented in the database. This methodology allows us to achieve near state-of-the-art results with limited computing costs and resources. Most importantly, our predictions maintain consistent quality across the entire spectrum of MOS values. Because our image sampling method is highly effective, we achieved these results at a much lower computational cost, which makes our approach the most effective way of MOS estimation on the KonIQ-10k database.
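The abstract does not spell out the cross-sampling procedure, so the following is only a generic sketch of the underlying idea of randomly drawing instances from majority classes: samples are grouped into MOS bins and each over-represented bin is capped by a random draw. The function name, bin width, cap, and toy data are all assumptions.

```python
import random
from collections import defaultdict


def undersample_by_mos_bin(samples, bin_width, cap, seed=0):
    """Randomly draw at most `cap` samples from each MOS bin.

    `samples` is a list of (image_id, mos) pairs. Bins holding `cap` or
    fewer samples (the under-represented MOS ranges) are kept whole.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for image_id, mos in samples:
        bins[int(mos // bin_width)].append((image_id, mos))
    subset = []
    for key in sorted(bins):
        members = bins[key]
        if len(members) > cap:
            members = rng.sample(members, cap)
        subset.extend(members)
    return subset


# Toy data (assumed): many mid-range scores, very few extreme ones.
samples = [(f"img{i}", 3.0 + (i % 10) / 10) for i in range(50)]
samples += [("low1", 1.2), ("low2", 1.7), ("high1", 4.6)]
epoch_subset = undersample_by_mos_bin(samples, bin_width=1.0, cap=5)
```

Drawing a fresh subset with a different seed for each epoch would expose the model to all majority-class images over time while keeping every individual epoch small and balanced.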
EN
Open, broken, and improperly closed manholes can pose problems for autonomous vehicles and thus need to be included in obstacle avoidance and lane-changing algorithms. In this work, we propose and compare multiple approaches for manhole localization and classification: classical computer vision, convolutional neural networks such as YOLOv3 and YOLOv3-Tiny, and vision transformers such as YOLOS and ViT. These are analyzed for speed, computational complexity, and accuracy in order to determine which model can be used with autonomous vehicles. In addition, we propose a size detection pipeline using classical computer vision to determine the size of the hole in an improperly closed manhole relative to the manhole itself. The evaluation showed that convolutional neural networks are currently better for this task, but vision transformers seem promising.
EN
Data augmentation is a popular approach to overcome the insufficiency of training data for medical imaging. Classical augmentation is based on modification (rotations, shears, brightness changes, etc.) of the images from the original dataset. Another possible approach is the use of Generative Adversarial Networks (GANs). This work is a continuation of previous research in which we trained StyleGAN2-ADA by Nvidia on a limited COVID-19 chest X-ray image dataset. In this paper, we study the dependence of GAN-based augmentation performance on dataset size, with a focus on small samples. Two datasets are considered, one with 1000 images per class (4000 images in total) and the second with 500 images per class (2000 images in total). We train StyleGAN2-ADA on both sets and then, after validating the quality of the generated images, we use the trained GANs as one of the augmentation approaches in multi-class classification problems. We compare the quality of the GAN-based augmentation approach with two other approaches (classical augmentation and no augmentation at all) by employing transfer learning-based classification of COVID-19 chest X-ray images. The results are quantified using different classification quality metrics and compared with the results from the previous article and the literature. The GAN-based augmentation approach is found to be comparable with classical augmentation for medium and large datasets but underperforms for smaller datasets. The correlation between the size of the original dataset and the quality of classification is visible regardless of the augmentation approach.
EN
Mosquito-borne diseases pose a substantial threat to public health. Vector surveillance and vector control approaches are critical to diminishing the mosquito population. Quick and precise identification of the mosquito species predominant in a geographic area is essential for ecological monitoring and for devising effective vector control strategies in the targeted areas. There has been growing interest in fine-tuning pretrained deep convolutional neural network models for the vision-based identification of insect genera, species and gender. Transfer learning is a technique commonly applied to adapt a pretrained model to a specific task on a different dataset, especially when the new dataset has a limited number of training images. In this research work, we investigate the capability of deep transfer learning to solve the multi-class classification problem of mosquito species identification. We train the pretrained deep convolutional neural networks using two transfer learning approaches: (i) feature extraction and (ii) fine-tuning. Three state-of-the-art pretrained models, VGG-16, ResNet-50 and GoogLeNet, were trained on a dataset of mobile-captured images of three vector mosquito species: Aedes aegypti, Anopheles stephensi and Culex quinquefasciatus. The results of the experiments show that GoogLeNet outperformed the other two models, achieving a classification accuracy of 92.5% with feature extraction and 96% with fine-tuning. It was also observed that fine-tuning the pretrained models improved the classification accuracy.
EN
The assessment of the dynamic performance of large-scale bridges typically relies on the deployment of wired instrumentation systems requiring direct contact with the tested structures. This can obstruct their operation and create unnecessary risks to the personnel and equipment involved. These problems can be readily avoided by using non-contact instrumentation systems. However, the cost of off-the-shelf commercial products often prevents their wide adoption in engineering practice. To this end, the dynamic performance of the biggest one-pylon cable-stayed bridge in Poland is investigated based on data from a consumer-grade digital camera and open-access image-processing algorithms. The quality of these data is benchmarked against data obtained from conventional wired accelerometers and a high-end commercial optical motion capture system. Operational modal analysis is conducted to extract modal damping, which has the potential to serve as an indicator of structural health. The dynamic properties of the bridge are evaluated against the results obtained during a proof loading exercise undertaken prior to the bridge opening. It is shown that a vibration monitoring system based on a consumer-grade digital camera can indeed provide an economically viable alternative for monitoring the complex time-evolving dynamic behaviour patterns of large-scale bridges.
EN
The capacity to navigate effectively in complex environments is a crucial prerequisite for mobile robots. In this study, the YOLOv5 model is used to identify objects and help the mobile robot determine movement conditions. Deep learning models trained on insufficient data can recognize objects inaccurately in unforeseen scenarios; this limitation is addressed by introducing a computer vision technique that detects lanes in real time. By combining the deep learning model with this computer vision technique, the robot can identify different types of objects, estimate distance, and adjust its speed accordingly. Additionally, the paper investigates recognition reliability under varying light intensities. When the illumination increases from 300 lux to 1000 lux, the reliability of the recognition model on different objects improves from about 75% to 98%. The findings of this study offer promising directions for future breakthroughs in mobile robot navigation.
EN
As the basic technology of human action recognition, pose estimation is attracting more and more attention from researchers, while edge application scenarios pose a greater challenge. This paper proposes a lightweight multi-person pose estimation scheme to meet the needs of real-time human action recognition at the edge. The scheme uses AlphaPose to extract human skeleton nodes and adds ResNet and Dense Upsampling Convolution to improve its accuracy. Meanwhile, we use YOLO to enhance AlphaPose’s support for multi-person pose estimation and optimize the proposed model with TensorRT. In addition, this paper uses Jetson Nano as the Edge AI deployment device for the proposed model and successfully migrates the model to the edge. The experimental results show that the optimized object detection model can reach 20 FPS and the optimized multi-person pose estimation model can reach 10 FPS. At an image resolution of 320×240, the model’s accuracy is 73.2%, which can meet the real-time requirements. In short, our scheme can provide a basis for lightweight multi-person action recognition at the edge.
14. Attention-based U-Net for image demoiréing
EN
Image demoiréing is a particular example of a picture restoration problem. Moiré is an interference pattern generated by overlaying similar but slightly offset templates. In this paper, we present a deep learning-based algorithm to reduce moiré disruptions. The proposed solution includes an explanation of the cross-sampling procedure, a training dataset management method that was optimized for limited computing resources. The suggested neural network architecture is based on the Attention U-Net structure. It is an exceptionally effective model that has not been proposed before for image demoiréing systems. The greatest improvement of this model over the U-Net network is the implementation of attention gates. These additional computing operations make the algorithm more focused on target structures. We also examined three MSE- and SSIM-based loss functions. The SSIM index is used to predict the perceived quality of digital images and videos. A similar approach has been applied in various computer vision areas. The author’s main contributions to the image demoiréing problem include the use of a novel architecture for this task, an innovative two-part loss function, and the atypical use of the cross-sampling training procedure.
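The abstract mentions a two-part loss built from MSE and SSIM but gives no formula. As a hedged illustration of how such a loss can be composed, the sketch below combines MSE with one minus a global SSIM using an assumed weight `alpha`. SSIM is normally averaged over local windows; here it is computed once over the whole image for brevity, and the function names, the weight, and the toy pixel values are all assumptions.

```python
def mse(x, y):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, dynamic_range=1.0):
    """SSIM evaluated once over the whole image (no sliding window)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # standard stabilizing constants
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def two_part_loss(pred, target, alpha=0.5):
    """Weighted sum of MSE and (1 - SSIM); alpha is an assumed weight."""
    return alpha * mse(pred, target) + (1 - alpha) * (1 - global_ssim(pred, target))


target = [0.1, 0.5, 0.9, 0.3]  # toy pixel intensities in [0, 1] (assumed)
noisy = [0.2, 0.4, 0.8, 0.5]
loss_same = two_part_loss(target, target)
loss_noisy = two_part_loss(noisy, target)
```

The MSE term penalizes per-pixel differences, while the SSIM term penalizes structural differences, so the two parts respond to different kinds of restoration error.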
EN
The aim of this work is to present a new methodology for the automated analysis of the cross-sections of experimental chip shapes. It enables, based on image processing methods, the determination of average chip thicknesses, chip curling radii and for segmented chips the extraction of chip segmentation lengths, as well as minimum and maximum chip thicknesses. To automatically decide whether a chip at hand should be evaluated using the proposed methods for continuous or segmented chips, a convolutional neural network is proposed, which is trained using supervised learning with available images from embedded chip cross-sections. Data from manual measurements are used for comparison and validation purposes.
EN
The world's oceans contain a great range of spectacular coral reefs. Unfortunately, they are in jeopardy due to an overabundance of one specific starfish, the coral-eating crown-of-thorns starfish (COTS). This article presents research aimed at delivering innovation in COTS control. It uses a deep learning model based on the You Only Look Once version 5 (YOLOv5) algorithm, deployed on an embedded device, for COTS detection. The model helps professionals optimize their time and resources and improves the efficiency of coral reef preservation worldwide. The algorithm performed very well, with a precision of 0.93, a recall of 0.77, and an F1 score of 0.84.
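The three metrics in this abstract are related: the F1 score is the harmonic mean of precision and recall. The short check below recomputes it from the reported values; the function name is an assumption for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.93, 0.77)
print(round(f1, 2))  # prints 0.84, matching the reported F1 score
```

A recall noticeably lower than precision, as here, means the detector misses more starfish than it raises false alarms.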
EN
Medical history highlights that myocardial infarction (MI) is one of the leading causes of death in human beings. Angina pectoris is a prominent vital sign of myocardial infarction. Medical reports suggest that chest pain experienced during a heart attack causes changes in facial muscles, resulting in variations in patterns of facial expression. This work aims to develop automatic facial expression detection to identify the severity of chest pain as a vital sign of MI, using an algorithmic approach implemented with state-of-the-art convolutional neural networks (CNNs). Two lightweight CNN object detection models, Single Shot Detector MobileNet V2 and Single Shot Detector Inception V2, were used to design the MI vital signs model from a private dataset of 500 RGB color images. The authors developed a cardiac emergency health monitoring system based on Edge Artificial Intelligence (“Edge AI”) using NVIDIA’s Jetson Nano embedded GPU platform. The proposed model focuses mainly on low cost and low power consumption for onboard real-time detection of vital signs of myocardial infarction. The evaluation yields a mean Average Precision of 85.18%, an Average Recall of 88.32%, and 6.85 frames per second for the generated detections.
EN
Recognizing faces under various lighting conditions is a challenging problem in artificial intelligence and its applications. In this paper we describe a new face recognition algorithm that is invariant to illumination. We first convert the images to the logarithm domain and then apply the dual-tree complex wavelet transform (DTCWT), which yields images approximately invariant to changes in illumination. We classify the images with the collaborative representation-based classifier (CRC). We also perform the following sub-band transformations: (i) we set the approximation sub-band to zero if the noise standard deviation is greater than 5; (ii) we then threshold the two highest frequency wavelet sub-bands using bivariate wavelet shrinkage; (iii) otherwise, we set these two highest frequency wavelet sub-bands to zero. On the resulting images we perform the inverse DTCWT, which produces illumination-invariant face images. The proposed method is strongly robust to Gaussian white noise. Experimental results show that our proposed algorithm outperforms several existing methods on the Extended Yale Face Database B and the CMU-PIE face database.
19. Image caption generation using transfer learning
EN
This paper describes an image caption generation system using deep neural networks. The model is trained to maximize the probability of the generated sentence given the image. The model uses transfer learning in the form of pretrained convolutional neural networks to preprocess the image data. The datasets consist of still photographs, each associated with five captions in English. The constructed model is compared with other similarly constructed models using the BLEU score, and ways to further improve its performance are proposed.
PL
W tym artykule opisano system generujący podpisy do zdjęć z wykorzystaniem głębokich sieci neuronowych. Model jest trenowany pod kątem maksymalizacji prawdopodobieństwa wygenerowanego zdania, dla zadanego obrazu. Model wykorzystuje uczenie transferowe w postaci wytrenowanych wstępnie neuronowych sieci konwolucyjnych. Zbiory danych wykorzystane do trenowania modelu składają się z fotografii, oraz przypisanych do niej pięciu zdań w języku angielskim. Skonstruowany model jest potem porównany z innymi modelami o podobnej konstrukcji z wykorzystaniem punktacji BLEU.
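Since the captioning model in this record is compared with others using BLEU, a minimal sketch of the metric may help: clipped (modified) n-gram precision against several reference captions, combined by a geometric mean and multiplied by a brevity penalty. This is a simplified illustration and not the evaluation code used in the paper; it uses n-grams up to length 2 to suit short toy sentences (BLEU is commonly reported up to 4-grams), and the function names and example captions are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each n-gram counts at most as often
    as it appears in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=2):
    """Geometric mean of modified precisions times the brevity penalty."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    # Reference length closest to the candidate length (shorter one on ties).
    ref_len = min((len(r) for r in references),
                  key=lambda length: (abs(length - len(candidate)), length))
    if len(candidate) > ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - ref_len / len(candidate))
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


references = ["a dog runs on grass".split(),
              "the dog is running on the grass".split()]
candidate = "a dog runs in the grass".split()
score = bleu(candidate, references)  # sqrt(5/6 * 3/5), about 0.71
```

Having several reference captions per image, as in the datasets described above, matters because an n-gram is credited if it matches any one of the references.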
20. Intelligent agrobots for crop yield estimation using computer vision
EN
The machine vision-based autonomous intelligent robots perform precise farm tasks such as robot harvesting, weeding, pest or fertilizer spraying, monitoring, and pruning. Estimating crop yield is an essential assignment on a regional or federal scale. For a long time the estimation measures were based on the statistics from manual counting of plants in a specific zone. The computer vision algorithms have addressed the technical drawbacks of the conventional image processing techniques and established an autonomous discipline and yielded new approaches to crop planning. A method for quantitative assessment of a tomato crop has been developed in this research using color thresholding in MATLAB using the RGB color model. Converting an RGB image to a grayscale image is one of the steps involved in detecting red color in a taken image. After subtracting the two images, a median filter is employed to filter the noisy pixels to produce a two-dimensional black and white image. The bounding boxes are used to label the binary digital images to detect related components, and the parameters of the labeled regions are computed to measure the number of tomatoes in a crop. The obtained R² correlation coefficient between the tomato berry counting algorithm and human counting was 0.98. Furthermore, the color of each pixel in the acquired image is evaluated by examining RGB values for pixel intensities in the obtained image. The performance of the berry counting algorithm was evaluated, and the technique was determined to have a high precision and recognition ratio of 96%. The research indicates that this technique may be used to estimate the crop yield, which is helpful information for forecasting yields, planning harvest plans, and generating prescription maps for field-specific management strategies. The proposed model performed exceptionally well in estimating yield with each tomato (Solanum lycopersicum) crop.
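The counting steps described in this abstract (subtract the grayscale image from the red channel, median-filter the result, binarize, and label connected regions) can be sketched as follows. The paper used MATLAB; this is an independent pure-Python illustration, and the threshold, grayscale weights, 3x3 window, 4-connectivity, and toy image are all assumptions.

```python
from collections import deque


def red_minus_gray(rgb):
    """Subtract the grayscale image from the red channel, clamped at 0."""
    return [[max(0.0, r - (0.299 * r + 0.587 * g + 0.114 * b))
             for r, g, b in row] for row in rgb]


def median3x3(img):
    """3x3 median filter; windows are truncated at the image border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = sorted(img[yy][xx]
                            for yy in range(max(0, y - 1), min(h, y + 2))
                            for xx in range(max(0, x - 1), min(w, x + 2)))
            out[y][x] = window[len(window) // 2]
    return out


def count_components(binary):
    """Count 4-connected regions of True pixels with a breadth-first search."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def count_red_objects(rgb, threshold=50.0):
    """Full pipeline: red-minus-gray, median filter, threshold, label, count."""
    filtered = median3x3(red_minus_gray(rgb))
    binary = [[value > threshold for value in row] for row in filtered]
    return count_components(binary)


# Toy image (assumed): green background, two red 3x3 blobs, one noisy red pixel.
GREEN, RED = (0, 120, 0), (200, 20, 20)
image = [[GREEN] * 12 for _ in range(8)]
for y in range(2, 5):
    for x in list(range(2, 5)) + list(range(7, 10)):
        image[y][x] = RED
image[6][6] = RED
tomato_count = count_red_objects(image)  # the isolated pixel is filtered out
```

The median filter is what keeps the isolated red pixel from being counted as a third object, which mirrors the noise-removal role it plays in the described method.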