Search results for keyword: feature fusion (10 results found)
EN
To achieve accurate identification and segmentation of ore under complex working conditions, machine vision and neural network techniques are applied to intelligent ore detection, and an improved Mask R-CNN instance segmentation algorithm is proposed. To address the misidentification of stacked ores caused by the loss of deep feature detail during feature extraction from ore images, an improved Multipath Feature Pyramid Network (MFPN) is introduced. The network first adds a single bottom-up feature fusion path and then merges it with the top-down feature fusion path of the original algorithm, which enriches deep feature detail, strengthens the fusion of the feature layers, and improves the network's ore recognition accuracy. The experimental results show that the proposed algorithm achieves a recognition accuracy of 96.5% for ore under complex working conditions, the recall and F-measure reach 97.4% and 97.0%, respectively, and the AP75 value is 6.84% higher than that of the original algorithm. Detection results on ore in real scenes show that the mask segmented by the network is close to the actual size of the ore, indicating that the improved network model performs well in detecting ore under different illumination, poses, and backgrounds. The proposed method therefore has good application prospects for identifying stacked ore under complex working conditions.
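The bottom-up fusion path described in the abstract can be sketched in a few lines. The following is a minimal numpy illustration of the idea only; the use of average-pooling downsampling and element-wise addition is an assumption, not the paper's exact operations:

```python
import numpy as np

def downsample(x):
    """Halve spatial resolution by 2x2 average pooling (stride 2)."""
    h, w, c = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def bottom_up_fusion(pyramid):
    """Add a bottom-up path: each coarser level receives the downsampled
    fused level below it, mirroring the original top-down pass."""
    out = [pyramid[0]]                 # finest level passes through unchanged
    for level in pyramid[1:]:          # coarser levels, fine to coarse
        out.append(level + downsample(out[-1]))
    return out

# Toy 3-level pyramid: 8x8, 4x4, 2x2 feature maps with 16 channels.
pyr = [np.ones((8, 8, 16)), np.ones((4, 4, 16)), np.ones((2, 2, 16))]
fused = bottom_up_fusion(pyr)
```

Each output level keeps its spatial shape while accumulating detail propagated upward from finer levels.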
EN
Automatic diagnosis of various ophthalmic diseases from ocular medical images is vital to support clinical decisions. Most current methods employ a single imaging modality, especially 2D fundus images. Since the diagnosis of ophthalmic diseases can greatly benefit from multiple imaging modalities, this paper further improves diagnostic accuracy by effectively exploiting cross-modal data. We propose a Transformer-based cross-modal multi-contrast network that efficiently fuses the color fundus photograph (CFP) and optical coherence tomography (OCT) modalities to diagnose ophthalmic diseases. We design a multi-contrast learning strategy to extract discriminative features from cross-modal data for diagnosis. A channel fusion head then captures the semantic information shared across modalities and the similarity features between patients of the same category. Meanwhile, a class-balanced training strategy copes with the class imbalance typical of medical datasets. Our method is evaluated on public benchmark datasets for cross-modal ophthalmic disease diagnosis, and the experimental results demonstrate that it outperforms other approaches. The code and models are available at https://github.com/ecustyy/tcmn.
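A minimal sketch of fusing two modality embeddings at the feature level, assuming simple L2 normalization followed by concatenation. The paper's channel fusion head is a learned Transformer component; `cfp` and `oct_` below are toy vectors, not real features:

```python
import numpy as np

def l2_normalize(v, eps=1e-8):
    """Scale a feature vector to unit length so neither modality dominates."""
    return v / (np.linalg.norm(v) + eps)

def fuse_modalities(cfp_feat, oct_feat):
    """Normalize each modality embedding, then concatenate along channels
    (a simple stand-in for a learned channel-fusion head)."""
    return np.concatenate([l2_normalize(cfp_feat), l2_normalize(oct_feat)])

cfp = np.array([3.0, 4.0])     # toy color-fundus-photograph embedding
oct_ = np.array([0.0, 5.0])    # toy OCT embedding
f = fuse_modalities(cfp, oct_)
```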
EN
When detecting cluster targets in ports or near-shore waters, the echo amplitude is seriously disturbed by interface reverberation, which distorts the traditional target-strength characteristics, and the appearance of multiple targets in the same or adjacent beams blurs feature recognition. Studying and extracting the spatial distribution scale and motion features that reflect the physics of cluster targets can improve the accuracy with which cluster-target characteristics are represented. Based on the highlight model of target acoustic scattering, the target azimuth is accurately estimated by the split-beam method to fit the spatial geometric scale formed by multiple highlights. The instantaneous frequencies of the highlights are extracted in the time-frequency domain, their Doppler shifts are calculated, and their motion states are estimated. With this processing, the orientation, spatial scale, and motion characteristics of the target highlights are fused, and multiple moving highlights in a typical formation within the same beam are accurately identified. The features are applied to acoustic scattering data from multiple moving unmanned underwater vehicles (UUVs) on a lake. The results show that multiple small moving underwater targets can be effectively recognized from their highlight scattering characteristics.
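The Doppler-shift step can be illustrated with the standard two-way active-sonar relation f_d = 2 v f_emit / c. The 1500 m/s sound speed and the frequencies below are assumed toy values, not the experiment's parameters:

```python
C_WATER = 1500.0  # nominal sound speed in water, m/s (assumed)

def radial_speed(f_emit, f_echo):
    """Estimate a highlight's radial (closing) speed from the Doppler
    shift of an active-sonar echo: f_d = 2 * v * f_emit / c,
    hence v = f_d * c / (2 * f_emit)."""
    f_d = f_echo - f_emit
    return f_d * C_WATER / (2.0 * f_emit)

# A 30 kHz ping returning at 30.08 kHz: an 80 Hz shift.
v = radial_speed(30_000.0, 30_080.0)
```

Repeating this per highlight over the extracted instantaneous frequencies gives each highlight's motion state.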
EN
Object tracking based on Siamese networks has achieved great success in recent years, but increasingly advanced trackers are also becoming cumbersome, which severely limits deployment on resource-constrained devices. To address this, we design a network based on the lightweight SiamFC tracking model that matches or exceeds the tracking performance of other lightweight models. At the same time, because the SiamFC tracking network handles similar semantic information, deformation, illumination change, and scale change poorly, we propose a global attention module and multi-scale training and testing strategies to address these weaknesses. To verify the effectiveness of the proposed algorithm, we conduct comparative experiments on the ILSVRC, OTB100, and VOT2018 datasets. The experimental results show that the proposed method significantly improves the performance of the baseline algorithm.
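The core of SiamFC-style tracking is cross-correlating a template with a search region to produce a response map whose peak locates the target. A naive numpy sketch of that operation (the real tracker correlates deep feature maps, not raw pixels as here):

```python
import numpy as np

def cross_correlation(search, template):
    """Slide the template over the search region and record the raw
    correlation score at each offset (the SiamFC-style response map)."""
    sh, sw = search.shape
    th, tw = template.shape
    resp = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(resp.shape[0]):
        for j in range(resp.shape[1]):
            resp[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return resp

search = np.zeros((6, 6))
search[2:4, 2:4] = 1.0            # the "target" patch sits at offset (2, 2)
template = np.ones((2, 2))
resp = cross_correlation(search, template)
peak = np.unravel_index(np.argmax(resp), resp.shape)
```

The response peak recovers the target's offset within the search region.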
EN
Scoliosis is a 3D spinal deformity in which the spine takes a lateral curvature, forming an angle in the coronal plane. Diagnosis of scoliosis requires periodic examination, and frequent exposure to radiative imaging may cause cancer. A safer and more economical alternative, the 3D ultrasound imaging modality, is being explored. However, unlike other modalities, an ultrasound image is noisy, which often suppresses the image's useful information. Through this research, a novel hybrid CNN architecture, the multi-scale feature fusion Skip-Inception U-Net (SIU-Net), is proposed for fully automatic bony feature detection, which can in turn be used to assess the severity of scoliosis safely and automatically. The proposed architecture incorporates two novel features into the basic U-Net architecture: (a) an improved Inception block and (b) newly designed decoder-side dense skip pathways. The proposed model is tested on a dataset of 109 spine ultrasound images. The architecture is evaluated using the popular (i) Jaccard index, (ii) Dice coefficient, and (iii) Euclidean distance, and compared with (a) the basic U-Net segmentation model, (b) the more evolved UNet++ model, and (c) the newly developed MultiResUNet model. The results show that SIU-Net gives the clearest segmentation output, especially in important regions of interest such as the thoracic and lumbar bony features. It also achieves the highest average Jaccard score (0.781) and Dice score (0.883) and the lowest histogram Euclidean distance (0.011) of the four models. SIU-Net looks promising for a fully automatic scoliosis detection system.
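The Jaccard index and Dice coefficient used in the evaluation are straightforward to compute from binary segmentation masks; a minimal numpy sketch with toy masks:

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection over union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union

def dice(pred, gt):
    """Twice the intersection over the sum of mask sizes."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

# Toy 2x3 masks: two pixels agree, each mask has three foreground pixels.
pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt   = np.array([[1, 1, 1], [0, 0, 0]], dtype=bool)
```

Both scores range over [0, 1], and Dice is always at least as large as Jaccard on the same pair of masks.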
EN
The morphological properties of retinal vessels are closely related to the diagnosis of ophthalmic diseases. However, many problems in retinal images, such as the complicated directions of vessels and the difficulty of recognizing capillaries, make accurate segmentation of retinal blood vessels challenging. Thus, we propose a new retinal blood vessel segmentation method based on a dual-channel asymmetric convolutional neural network (CNN). First, we construct a thick- and thin-vessel extraction module based on the morphological differences in retinal vessels. A two-dimensional (2D) Gabor filter perceives the scale characteristics of blood vessels after selecting their direction, thereby adaptively extracting from fundus images the thick-vessel features characterizing the overall structure and the thin-vessel features preserving the capillaries. Then, considering that a single-channel network is unsuitable for the unified characterization of thick and thin vessels, we develop a dual-channel asymmetric CNN based on the U-Net model. The MainSegment-Net uses a step-by-step connection mode to achieve rapid positioning and segmentation of thick vessels; the FineSegment-Net combines dilated convolution and skip connections to achieve fine extraction of thin vessels. Finally, the outputs of the dual-channel asymmetric CNN are fused and encoded to combine the segmentation results of thick and thin vessels. The performance of our method is evaluated on DRIVE and CHASE_DB1. The results show that the accuracy (Acc), sensitivity (SE), and specificity (SP) of our method on the DRIVE database are 0.9630, 0.8745, and 0.9823, respectively. On the CHASE_DB1 database, Acc, SE, and SP are 0.9694, 0.8916, and 0.9794, respectively.
Additionally, our method combines the biological vision mechanism with deep learning to achieve rapid and automatic segmentation of retinal vessels, providing a new idea for diagnosing and analyzing subsequent medical images.
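The 2D Gabor filter used to perceive vessel scale and direction can be sketched directly; the kernel form below (Gaussian envelope times an oriented cosine carrier) is standard, but the parameter values are illustrative, not those of the paper:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2-D Gabor filter: a sinusoid at angle theta
    modulated by an isotropic Gaussian envelope. Matching the
    wavelength to the vessel width makes the filter respond most
    strongly to vessels of that scale and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + y_t**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_t / wavelength)
    return envelope * carrier

# Illustrative 15x15 kernel tuned to a ~6-pixel-wide horizontal structure.
k = gabor_kernel(size=15, wavelength=6.0, theta=0.0, sigma=3.0)
```

Convolving the fundus image with a bank of such kernels over several thetas and wavelengths yields the direction- and scale-selective responses described above.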
7. Multi-path convolutional neural network in fundus segmentation of blood vessels
EN
Retinal vascular status is closely correlated with physical diseases such as eye lesions. Retinal fundus images are an important basis for diagnosing diseases such as diabetes, glaucoma, hypertension, and coronary heart disease. Because retinal blood vessels vary in thickness, with the thinnest only one or two pixels wide, obtaining accurate measurements is critical and challenging. In this paper, we propose a new retinal blood vessel segmentation method based on a multi-path convolutional neural network, which can be used for computer-based clinical medical image analysis. First, a low-frequency image characterizing the overall features of the retinal blood vessel image and a high-frequency image characterizing the local detailed features are obtained using a Gaussian low-pass filter and a Gaussian high-pass filter, respectively. Then a feature extraction path is constructed for each of the low- and high-frequency images. Finally, according to the responses of the low-frequency and high-frequency feature extraction paths, whole-vessel perception and local feature information are fused and encoded, and the final blood vessel segmentation map is obtained. The performance of this method is evaluated on DRIVE and CHASE_DB1. On the DRIVE database, the evaluation indexes accuracy (Acc), sensitivity (SE), and specificity (SP) are 0.9580, 0.8639, and 0.9665, respectively; on the CHASE_DB1 database, they are 0.9601, 0.8778, and 0.9680, respectively. In addition, the proposed method effectively suppresses noise, ensures continuity after blood vessel segmentation, and provides a feasible new idea for intelligent visual perception of medical images.
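The low-/high-frequency decomposition can be sketched in the Fourier domain. The Gaussian transfer function and sigma below are assumptions for illustration; by construction the high-frequency image is the residual, so the two parts sum back to the original:

```python
import numpy as np

def gaussian_split(image, sigma):
    """Split an image into low- and high-frequency parts by applying a
    Gaussian low-pass transfer function in the Fourier domain; the
    high-frequency part is the residual, so low + high == image."""
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]            # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]            # horizontal frequencies
    lowpass = np.exp(-(fx**2 + fy**2) * (2.0 * (np.pi * sigma)**2))
    low = np.real(np.fft.ifft2(np.fft.fft2(image) * lowpass))
    return low, image - low

img = np.random.default_rng(0).random((32, 32))
low, high = gaussian_split(img, sigma=2.0)
```

The low image feeds the coarse whole-vessel path and the high image feeds the local-detail path.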
EN
This study presents a computer-aided diagnostic system for the hierarchical classification of normal, fatty, and heterogeneous liver ultrasound images using feature fusion techniques. Both spatial- and transform-domain features are used in the classification, since both contribute positively to classification accuracy. After extracting gray-level co-occurrence matrix and completed local binary pattern features as spatial-domain features, and statistical features of 2D wavelet packet transform sub-images and 2D Gabor filter bank responses as transform-domain features, a particle swarm optimization algorithm selects the dominant features of the serially and parallelly fused feature spaces. Classification proceeds in two steps: first, focal livers are separated from diffuse ones; second, normal livers are distinguished from fatty ones. On the database used, maximum classification accuracies of 100% and 98.86% are achieved by the serial and parallel feature fusion modes, respectively, using leave-one-out cross-validation (LOOCV) and a support vector machine (SVM) classifier.
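Serial and parallel feature fusion are commonly realized as concatenation and complex-vector combination, respectively; a minimal numpy sketch of both (the paper's exact fusion operators may differ):

```python
import numpy as np

def serial_fusion(a, b):
    """Serial fusion: stack the two feature vectors end to end,
    giving a real vector of length len(a) + len(b)."""
    return np.concatenate([a, b])

def parallel_fusion(a, b):
    """Parallel fusion: combine the vectors as a complex vector a + i*b;
    the shorter vector is zero-padded to match the longer one."""
    n = max(len(a), len(b))
    a = np.pad(a, (0, n - len(a)))
    b = np.pad(b, (0, n - len(b)))
    return a + 1j * b

a = np.array([1.0, 2.0, 3.0])   # toy spatial-domain feature vector
b = np.array([4.0, 5.0])        # toy transform-domain feature vector
s = serial_fusion(a, b)
p = parallel_fusion(a, b)
```

Serial fusion grows the dimensionality; parallel fusion keeps it at the larger of the two, which is why feature selection behaves differently on the two fused spaces.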
EN
Biometrics is the science of recognizing humans by means of their biological, chemical, or behavioural traits. Biometric systems are used in many real-life applications, ranging from biometric attendance systems to security at very sophisticated levels. A biometric system deals with raw data captured by a sensor and a feature template extracted from the raw image. One challenge faced by designers of these systems is securing the template data extracted from the user's biometric modalities and protecting the raw images. One way to minimize spoof attacks on biometric systems by unauthorised users is to use multi-biometric systems. A multi-modal biometric system uses a fusion technique to merge the feature templates generated from different modalities of the human. In this work, a novel scheme is proposed to secure the template at the feature fusion level. The scheme is based on the union operation of fuzzy relations of the modality templates during the fusion process of a multimodal biometric system. This approach serves the dual purpose of feature fusion and transformation of the templates into a single, secure, non-invertible template. The proposed technique is irreversible and diverse, and is experimentally tested on a bimodal biometric system comprising fingerprint and hand geometry. The scheme yields a significant improvement in system performance, with a lower equal error rate and an improved genuine acceptance rate.
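The fuzzy-union idea can be sketched as an element-wise maximum over normalized membership values. The templates below are toy vectors; the real scheme operates on fuzzy relations of full biometric templates:

```python
import numpy as np

def fuzzy_union_fusion(t1, t2):
    """Fuse two normalized feature templates by the fuzzy-set union:
    the element-wise maximum of their membership values. The result
    mixes both modalities, and neither original template can be
    recovered from it alone."""
    assert t1.shape == t2.shape
    assert ((0 <= t1) & (t1 <= 1)).all() and ((0 <= t2) & (t2 <= 1)).all()
    return np.maximum(t1, t2)

finger = np.array([0.2, 0.9, 0.4])   # toy fingerprint membership values
hand   = np.array([0.7, 0.1, 0.4])   # toy hand-geometry membership values
fused = fuzzy_union_fusion(finger, hand)
```

Because max is many-to-one per element, the fused template is non-invertible: a fused value of 0.7 reveals neither which modality produced it nor the other modality's value.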
10. Feature fusion of palmprint and face via tensor analysis and curvelet transform
EN
To improve the recognition accuracy of unimodal biometric systems and to address the small-sample recognition problem, this paper proposes a multimodal biometric recognition approach based on feature-level fusion and the curve tensor. The curve tensor approach extends the tensor analysis method to the curvelet coefficient space. Two kinds of biometrics are used: palmprint recognition and face recognition. All image features are extracted with the curve tensor algorithm, and the normalized features are then combined at the feature fusion level using several fusion strategies. A k-nearest neighbour (KNN) classifier determines the final biometric classification. The experimental results demonstrate that the proposed approach outperforms the unimodal solution and that the proposed nearly Gaussian fusion (NGF) strategy performs better than the other fusion rules.
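The final KNN decision on the fused features can be sketched as follows; the toy data, the Euclidean metric, and the value of k are illustrative, not the paper's settings:

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Classify a fused feature vector by majority vote among its
    k nearest training vectors under Euclidean distance."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy fused palmprint+face features: class 0 near the origin, class 1 near (5, 5).
train_x = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.1],
                    [5.0, 5.1], [4.9, 5.0], [5.1, 4.9]])
train_y = np.array([0, 0, 0, 1, 1, 1])
label = knn_predict(train_x, train_y, np.array([4.8, 5.2]), k=3)
```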