Results found: 56
Search results
Searched in keywords: convolutional neural network
EN
The Lombard effect is an involuntary increase in the speaker's pitch, intensity, and duration in the presence of noise. It makes communication in noisy environments more effective. This study aims to investigate an efficient method for detecting the Lombard effect in uttered speech. The influence of interfering noise, room type, and the speaker's gender on the detection process is examined. First, acoustic parameters related to speech changes produced by the Lombard effect are extracted. Mid-term statistics are built upon these parameters and used for the self-similarity matrix construction. These matrices constitute the input data for a convolutional neural network (CNN). The self-similarity-based approach is then compared with two other methods, i.e., spectrograms used as input to the CNN and speech acoustic parameters combined with the k-nearest neighbors algorithm. The experimental investigations show the superiority of the self-similarity approach to Lombard effect detection over the other two methods. Moreover, the small standard deviation values obtained for the self-similarity approach confirm the consistency of its high accuracies.
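The self-similarity construction described above can be sketched in a few lines. This is a minimal illustration, assuming the mid-term statistics are already available as one feature vector per frame; the cosine metric and all names are illustrative choices, not necessarily the paper's exact formulation:

```python
import numpy as np

def self_similarity_matrix(features):
    """Cosine self-similarity between mid-term feature vectors.

    features: (n_frames, n_params) array of mid-term statistics.
    Returns an (n_frames, n_frames) matrix usable as CNN input.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Toy example: 4 frames of 3 acoustic parameters each
feats = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [2.0, 0.0, 0.0]])
S = self_similarity_matrix(feats)
```

Because the matrix is symmetric with a unit diagonal, repetitive or contrasting frame patterns show up as block structure, which is what the CNN then classifies.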
2. Denseformer for single image deraining
EN
The image is one of the most important forms of information expression in multimedia and a key factor in determining the visual quality of multimedia software. As an image restoration task, image deraining can effectively restore the original information of an image, which benefits downstream tasks. In recent years, with the development of deep learning, CNN and Transformer structures have shone in computer vision. In this paper, we summarize the keys to the past success of these structures and, on this basis, introduce the concept of a layer aggregation mechanism that describes how to reuse the information of previous layers to better extract features in the current layer. Based on this layer aggregation mechanism, we build a rain removal network called DenseformerNet. Our network strengthens feature propagation and encourages feature reuse, allowing better information and gradient flow. Through a large number of experiments, we show that our model is efficient and effective, and we expect it to offer some insight for future rain removal networks.
EN
Obstructive sleep apnea (OSA) is a long-term sleep disorder that causes temporary disruption in breathing while sleeping. Polysomnography (PSG) is the technique for monitoring different signals during the patient’s sleep cycle, including electroencephalogram (EEG), electromyography (EMG), electrocardiogram (ECG), and oxygen saturation (SpO2). Due to the high cost and inconvenience of polysomnography, the usefulness of ECG signals in detecting OSA is explored in this work, which proposes a two-dimensional convolutional neural network (2D-CNN) model for detecting OSA using ECG signals. A publicly available apnea ECG database from PhysioNet is used for experimentation. Further, a constant Q-transform (CQT) is applied for segmentation, filtering, and conversion of ECG beats into images. The proposed CNN model demonstrates an average accuracy, sensitivity and specificity of 91.34%, 90.68% and 90.70%, respectively. The findings obtained using the proposed approach are comparable to those of many other existing methods for automatic detection of OSA.
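The core idea of converting ECG beats into time-frequency images can be illustrated with a toy constant-Q-style transform. This is a simplified stand-in for a real CQT library call, not the paper's implementation; the frame/hop sizes and the geometric frequency spacing are illustrative:

```python
import numpy as np

def cqt_like_image(signal, fs, f_min=2.0, bins=32, bins_per_octave=8,
                   frame_len=64, hop=32):
    """Toy constant-Q-style time-frequency image of a 1D signal.

    Analysis frequencies are geometrically spaced (the defining property
    of a CQT); per-frame magnitudes come from complex-exponential dot
    products against each analysis frequency.
    """
    freqs = f_min * 2.0 ** (np.arange(bins) / bins_per_octave)
    t = np.arange(frame_len) / fs
    kernels = np.exp(-2j * np.pi * freqs[:, None] * t[None, :])
    n_frames = 1 + (len(signal) - frame_len) // hop
    image = np.empty((bins, n_frames))
    for i in range(n_frames):
        frame = signal[i * hop:i * hop + frame_len]
        image[:, i] = np.abs(kernels @ frame) / frame_len
    return image, freqs

fs = 250                                # a common ECG sampling rate
t = np.arange(500) / fs
image, freqs = cqt_like_image(np.sin(2 * np.pi * 10 * t), fs)
```

The resulting 2D array is what a 2D-CNN would consume; a production pipeline would use a proper CQT implementation with windowing and variable-length kernels.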
EN
Transfer learning has surfaced as a compelling technique in machine learning, enabling the transfer of knowledge across networks. This study evaluates the efficacy of ImageNet pretrained state-of-the-art networks, including DenseNet, ResNet, and VGG, in implementing transfer learning for prepruned models on compact datasets, such as FashionMNIST, CIFAR10, and CIFAR100. The primary objective is to reduce the number of neurons while preserving high-level features. To this end, local sensitivity analysis is employed alongside p-norms and various reduction levels. This investigation discovers that VGG16, a network rich in parameters, displays resilience to high-level feature pruning. Conversely, the ResNet architectures reveal an interesting pattern of increased volatility. These observations assist in identifying an optimal combination of the norm and the reduction level for each network architecture, thus offering valuable directions for model-specific optimization. This study marks a significant advance in understanding and implementing effective pruning strategies across diverse network architectures, paving the way for future research and applications.
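Norm-based unit pruning at a chosen reduction level, as discussed above, can be sketched as follows. This is a generic illustration of p-norm ranking on one dense layer, not the study's sensitivity-analysis procedure; the function name and 50% reduction level are assumptions:

```python
import numpy as np

def prune_units_by_norm(weight, reduction=0.5, p=1):
    """Zero out the dense-layer units (rows) with the smallest p-norm.

    weight: (n_units, n_inputs); reduction: fraction of units to remove.
    Returns the pruned weight matrix and the indices of the kept units.
    """
    norms = np.linalg.norm(weight, ord=p, axis=1)
    n_drop = int(round(reduction * weight.shape[0]))
    drop = np.argsort(norms)[:n_drop]
    pruned = weight.copy()
    pruned[drop, :] = 0.0
    kept = np.setdiff1d(np.arange(weight.shape[0]), drop)
    return pruned, kept

W = np.array([[1.0, 1.0],     # L1 norm 2
              [10.0, 10.0],   # L1 norm 20
              [0.1, 0.1],     # L1 norm 0.2
              [5.0, 5.0]])    # L1 norm 10
P, kept = prune_units_by_norm(W, reduction=0.5, p=1)
```

Sweeping `p` and `reduction` per architecture is the kind of grid the review above explores when looking for the optimal norm/level combination.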
EN
Face recognition (FR) is one of the most active research areas in the field of computer vision. Convolutional neural networks (CNNs) have been used extensively in this field due to their good efficiency, so it is important to find the CNN parameters that yield the best performance. Hyperparameter optimization is one of several techniques for increasing the performance of CNN models. Since manual tuning of hyperparameters is a tedious and time-consuming task, population-based metaheuristic techniques can be used for the automatic hyperparameter optimization of CNNs. Automatic tuning of parameters reduces manual effort and improves the efficiency of the CNN model. In the proposed work, genetic algorithm (GA) based hyperparameter optimization of CNNs is applied to face recognition. The GA optimizes hyperparameters such as the filter size, the number of filters, and the number of hidden layers. For the analysis, a benchmark FR dataset with ninety subjects is used. The experimental results indicate that the proposed GA-CNN model achieves improved accuracy in comparison with existing CNN models. In each iteration, the GA minimizes the objective function by selecting the best combination of CNN hyperparameters. An improved accuracy of 94.5% is obtained for FR.
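A GA over CNN hyperparameters can be sketched with a toy fitness function. Everything here is illustrative: the search space values, the crossover/mutation rates, and especially `fitness`, which in the paper would be the validation accuracy of a trained CNN rather than a closed-form score:

```python
import random

# Hypothetical search space: filter size, number of filters, hidden layers
SPACE = {"filter_size": [3, 5, 7], "n_filters": [16, 32, 64], "n_hidden": [1, 2, 3]}

def fitness(ind):
    # Stand-in for validation accuracy of a trained CNN; a real run would
    # build and evaluate the model here. This toy score peaks at (5, 32, 2).
    return -abs(ind[0] - 5) - abs(ind[1] - 32) / 16 - abs(ind[2] - 2)

def evolve(pop_size=8, generations=20, seed=0):
    rng = random.Random(seed)
    keys = list(SPACE)
    pop = [tuple(rng.choice(SPACE[k]) for k in keys) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(keys))   # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.2:              # mutation
                i = rng.randrange(len(keys))
                child[i] = rng.choice(SPACE[keys[i]])
            children.append(tuple(child))
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

The expensive part in practice is `fitness`; each evaluation trains a CNN, which is why GA population sizes for this task are kept small.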
EN
In this paper, we propose a method for reducing thermal noise in diffusion-weighted magnetic resonance images (DWI MRI) of the brain using a convolutional neural network (CNN) trained on realistic, synthetic MR data. Two reference methods are considered: a) averaging of repeated scans, a widespread method used in clinics to improve the signal-to-noise ratio of MR images, and b) the blockwise non-local means (NLM) filter, one of the post-processing methods frequently used in DWI denoising. To obtain training data for transfer learning, the effects of echo-planar imaging (EPI) – Nyquist ghosting and ramp sampling – are modelled in a data-driven fashion. These effects are introduced into the digital phantom of brain anatomy (BrainWeb). Real noise maps are obtained from the MRI scanner with a protocol designed for brain DWI and are later combined with the simulated, noise-free EPI images. The point spread function is measured in a DW image of an AJR-approved geometrical phantom. Inter-scan patient movement is captured from a brain scan of a healthy volunteer using image registration. The denoising methods are applied to the simulated EPI brain images and to real EPI DWI of the brain. The quality of the denoised images is evaluated at several signal-to-noise ratios. The characteristics of the noise residuals are studied thoroughly. A diffusion phantom is used to investigate the influence of denoising on ADC measurements. The method is also evaluated on a GRAPPA dataset. We show that our method outperforms NLM and image averaging and allows for a significant reduction in scan time by lowering the number of repeated scans. We also analyse the trained CNN denoisers and point out the challenges accompanying this denoising method.
EN
Uncontrolled diabetes leads to serious complications comparable to cancer. An infected foot ulcer causes a 5-year mortality of 50%. Proper treatment of foot wounds is essential, and wound area monitoring plays an important role in this area. In this article, we describe an automatic wound area measurement service that facilitates area measurement; the measurement result is based on adaptive calibration for greater accuracy on curved surfaces. Users take a digital picture of a wound together with calibration markers and submit it for analysis via a web page. A deep learning model based on convolutional neural networks (CNNs) was trained using 565 wound images and was used for image segmentation to identify the wound and the calibration markers. The developed software calculates the wound area from the number of pixels in the wound region and the calibration coefficient determined from the distances between ticks on the calibration markers. The result of the measurement is sent back to the user at the provided e-mail address. The median relative error of wound area measurement in the wound models was 1.21%. The efficacy of the CNN model was tested on 41 wounds and 73 wound models. The averaged values of the Dice similarity coefficient, intersection over union, accuracy and specificity for wound identification were 90.9%, 83.9%, 99.3% and 99.6%, respectively. The service proved its high efficacy and can be used in wound area monitoring, not only by health care specialists but also by patients. Thus, it is an important tool for wound-healing monitoring.
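The pixel-count-to-area step described above reduces to one formula once segmentation has produced the wound pixel count and the marker tick distance. A minimal sketch, with all numbers and names invented for illustration:

```python
def wound_area_cm2(wound_pixels, marker_dist_pixels, marker_dist_cm):
    """Convert a segmented wound's pixel count to cm^2.

    marker_dist_pixels: distance between calibration-marker ticks in the image;
    marker_dist_cm: the known physical distance between those ticks.
    """
    cm_per_px = marker_dist_cm / marker_dist_pixels
    return wound_pixels * cm_per_px ** 2

# A 40,000-pixel wound with a 1 cm marker spacing spanning 200 pixels
area = wound_area_cm2(wound_pixels=40000, marker_dist_pixels=200, marker_dist_cm=1.0)
```

The adaptive-calibration refinement in the paper would vary `cm_per_px` across the image for curved surfaces instead of using one global coefficient.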
EN
Accurate nuclei segmentation is a critical step for physicians to obtain essential information about a patient's disease from digital pathology images, enabling an effective diagnosis and evaluation of subsequent treatments. Since pathology images contain many nuclei, manual segmentation is time-consuming and error-prone, so developing a precise and automatic method for nuclei segmentation is urgent. This paper proposes a novel multi-task segmentation network that incorporates background and contour segmentation into the nuclei segmentation method and produces more accurate segmentation results. Convolution and attention modules are merged in the model to increase its global focus and thereby improve segmentation quality. We propose a reverse feature enhancement module for contour extraction that facilitates feature integration between the auxiliary tasks. A multi-feature fusion module is embedded in the final decoding branch to use different levels of features from the auxiliary segmentation branches with varying concerns. We evaluate the proposed method on four challenging nuclei segmentation datasets, on all of which it achieves excellent performance: the Dice coefficient reached 0.8563±0.0323, 0.8183±0.0383, 0.9222±0.0216, and 0.9220±0.0602 on TNBC, MoNuSeg, KMC, and GlaS, respectively. Our method produces better boundary accuracy and less sticking between nuclei than other end-to-end segmentation methods. The results show that our method can perform better than other proposed state-of-the-art methods.
EN
Around the world, several lung diseases such as pneumonia, cardiomegaly, and tuberculosis (TB) contribute to severe illness, hospitalization or even death, particularly for elderly and medically vulnerable patients. In the last few decades, several new types of lung-related diseases have taken the lives of millions of people, and COVID-19 alone has taken almost 6.27 million lives. To fight lung diseases, timely and correct diagnosis with appropriate treatment is crucial in the current COVID-19 pandemic. In this study, an intelligent recognition system for seven lung diseases is proposed based on machine learning (ML) techniques to aid medical experts. Chest X-ray (CXR) images of lung diseases were collected from several publicly available databases. A lightweight convolutional neural network (CNN) is used to extract characteristic features from the raw pixel values of the CXR images. The best feature subset is identified using the Pearson correlation coefficient (PCC). Finally, an extreme learning machine (ELM) performs the classification task to provide faster learning and reduced computational complexity. The proposed CNN-PCC-ELM model achieved an accuracy of 96.22% with an area under the curve (AUC) of 99.48% for eight-class classification. The proposed model outperformed the existing state-of-the-art (SOTA) models in COVID-19, pneumonia, and tuberculosis detection in both binary and multiclass classifications. For COVID-19 detection in the eight-class setting, the proposed model achieved a precision, recall, F1-score and ROC-AUC of 100%, 99%, 100% and 99.99%, respectively, demonstrating its robustness. Therefore, the proposed model surpasses existing pioneering models in accurately differentiating COVID-19 from the other lung diseases, which can assist medical physicians in treating patients effectively.
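PCC-based feature selection, the middle stage of the pipeline above, is straightforward to sketch: rank features by the absolute Pearson correlation with the label and keep the top k. The synthetic data and function name are illustrative, not the paper's setup:

```python
import numpy as np

def select_by_pcc(X, y, k):
    """Keep the k features with the largest |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r))[:k]

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
X = rng.normal(size=(200, 5))
X[:, 3] = y + 0.1 * rng.normal(size=200)   # feature 3 strongly tracks the label
idx = select_by_pcc(X, y, k=2)
```

In the CNN-PCC-ELM pipeline, `X` would be the CNN feature activations and the selected columns would feed the ELM classifier.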
EN
Motor imagery (MI) decoding is the core of an intelligent rehabilitation system in brain-computer interfaces, and it has a potential advantage in using source signals, which have higher spatial resolution and the same temporal resolution compared to scalp electroencephalography (EEG). However, how to exploit the personalized frequency characteristics of dipoles to improve decoding performance has not received sufficient attention. In this paper, a novel dipole feature imaging (DFI) method and a hybrid convolutional neural network (HCNN) with an embedded squeeze-and-excitation block (SEB), denoted together as DFI-HCNN, are proposed for decoding MI tasks. EEG source imaging is used for brain source estimation, and the spectral power of each sub-band of every dipole is calculated through frequency analysis and band division. Then, the 3D spatial information of the dipoles is retrieved and transformed to a 2D plane using the azimuthal equidistant projection algorithm, combined with nearest-neighbor interpolation to generate multi-sub-band dipole feature images. Furthermore, an HCNN is designed and applied to the ensemble of sub-band dipole feature images, from which the importance of the sub-bands is acquired so that the SEB can adjust the corresponding attention adaptively. Ten-fold cross-validation experiments on two public datasets achieve comparatively high decoding accuracies of 84.23% and 92.62%, respectively. The experimental results show that DFI is an effective feature representation and that an HCNN with an embedded SEB can enhance the useful frequency information of dipoles to improve MI decoding.
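The azimuthal equidistant projection step used to flatten 3D dipole positions onto a 2D image plane can be sketched directly from its definition: the angular distance from a chosen pole becomes the planar radius. The pole choice (0, 0, 1) and function name are assumptions for illustration:

```python
import numpy as np

def azimuthal_equidistant(xyz):
    """Project 3D points on/near a unit sphere to a 2D plane.

    The pole (0, 0, 1) maps to the origin; the angular distance from the
    pole becomes the planar radius, so distances from the pole are preserved.
    """
    xyz = xyz / np.linalg.norm(xyz, axis=1, keepdims=True)
    theta = np.arccos(np.clip(xyz[:, 2], -1.0, 1.0))   # angle from the pole
    phi = np.arctan2(xyz[:, 1], xyz[:, 0])             # azimuth
    return np.stack([theta * np.cos(phi), theta * np.sin(phi)], axis=1)

pts = np.array([[0.0, 0.0, 1.0],    # pole
                [1.0, 0.0, 0.0],    # on the equator, +x direction
                [0.0, 1.0, 0.0]])   # on the equator, +y direction
uv = azimuthal_equidistant(pts)
```

After projection, scattering the per-dipole sub-band powers onto a pixel grid with nearest-neighbor interpolation yields the 2D feature images the HCNN consumes.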
EN
COVID-19 brought the whole world to a standstill. The current detection methods are time-consuming as well as costly. Using chest X-rays (CXRs) is a solution to this problem; however, manual examination of CXRs is a cumbersome and difficult process requiring specialization in the domain. Most existing methods for this application involve pretrained models such as VGG19, ResNet, DenseNet, Xception, and EfficientNet, which were trained on RGB image datasets. X-rays are fundamentally single-channel images, so using an RGB-trained model is not appropriate, since it triples the number of operations by involving three channels instead of one. One way to use a pretrained model with grayscale images is to replicate the one-channel image data across three channels, which introduces redundancy; another is to alter the input layer of the pretrained model to accept one-channel image data, but the weights in the subsequent layers were trained on three-channel images, which weakens the benefit of pretrained weights in a transfer learning approach. This paper suggests a novel approach for identifying COVID-19 from CXRs that uses Contrast Limited Adaptive Histogram Equalization (CLAHE) along with a homomorphic transformation filter to process the pixel data and extract features from the CXRs. These processed images are then provided as input to a VGG-inspired deep convolutional neural network (CNN) model that takes one-channel (grayscale) image data as input and categorizes CXRs into three class labels: No-Findings, COVID-19, and Pneumonia. The suggested model is evaluated with the help of two publicly available datasets: one providing the COVID-19 and No-Finding images and the other the Pneumonia CXRs. The dataset comprises 6750 images in total, 2250 per class.
The results show that the model achieved 96.56% accuracy for multi-class classification and 98.06% for binary classification using 5-fold stratified cross-validation (CV). This result is competitive with the performance of existing approaches to COVID-19 classification.
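The contrast-enhancement idea behind the preprocessing above can be illustrated with plain global histogram equalization. This is a deliberately simplified stand-in for CLAHE, which additionally clips the histogram and operates on local tiles; the image values are synthetic:

```python
import numpy as np

def hist_equalize(img):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Map the cumulative distribution onto the full 0..255 range
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

# Low-contrast image confined to [100, 130] gets stretched toward [0, 255]
img = np.tile(np.arange(100, 131, dtype=np.uint8), (31, 1))
eq = hist_equalize(img)
```

CLAHE replaces the single global lookup table with one per tile and a clip limit, which is why it enhances local lung detail without amplifying noise as aggressively.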
EN
Cerebral malaria (CM) is a fatal syndrome found commonly in children under 5 years old in Sub-Saharan Africa and Asia. The retinal signs associated with CM are known as malarial retinopathy (MR); they include highly specific retinal lesions such as whitening and hemorrhages, and detecting these lesions allows CM to be detected with high specificity. Up to 23% of CM patients are over-diagnosed due to the presence of clinical symptoms also related to pneumonia, meningitis, or other conditions; such patients then go untreated for these pathologies, resulting in death or neurological disability. It is therefore essential to have a low-cost, high-specificity diagnostic technique for CM detection, for which we developed a method based on transfer learning (TL). Models pre-trained with TL select the good-quality retinal images, which are fed into another TL model to detect CM. This approach shows 96% specificity with low-cost retinal cameras.
EN
Chronic obstructive pulmonary disease (COPD) is a complex and multi-component respiratory disease. Computed tomography (CT) images can characterize lesions in COPD patients, but the image intensity and the morphology of lung components have not been fully exploited. Two datasets (Datasets 1 and 2) comprising a total of 561 subjects were obtained from two centers. A multiple instance learning (MIL) method is proposed for COPD identification. First, randomly selected slices (instances) from the CT scans and multi-view 2D snapshots of the 3D airway tree and lung field extracted from the CT images are acquired. Then, three attention-guided MIL models (slice-CT, snapshot-airway, and snapshot-lung-field models) are trained. In these models, a deep convolutional neural network (CNN) is utilized for feature extraction. Finally, the outputs of the three MIL models are combined using logistic regression to produce the final prediction. For Dataset 1, the accuracy of the slice-CT MIL model with 20 instances was 88.1%. The VGG-16 backbone outperformed AlexNet, ResNet18, ResNet26, and MobileNet_v2 in feature extraction. The snapshot-airway and snapshot-lung-field MIL models achieved accuracies of 89.4% and 90.0%, respectively. After the three models were combined, the accuracy reached 95.8%. The proposed model outperformed several state-of-the-art methods and afforded an accuracy of 83.1% on the external dataset (Dataset 2). The proposed weakly supervised MIL method is feasible for COPD identification. The effective CNN module and the attention-guided MIL pooling module contribute to the performance enhancement, and the morphology information of the airway and lung field is beneficial for identifying COPD.
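Attention-guided MIL pooling, the mechanism shared by the three models above, scores each instance, softmax-normalizes the scores, and returns a weighted bag embedding. A minimal sketch with random stand-in features and parameters (the real attention weights `w`, `v` would be learned jointly with the CNN):

```python
import numpy as np

def attention_mil_pool(instance_feats, w, v):
    """Attention pooling over a bag of instance features.

    instance_feats: (n_instances, d) CNN features of slices/snapshots.
    w: (d, h) and v: (h,) are the attention parameters.
    Returns the attention weights and the weighted bag embedding.
    """
    scores = np.tanh(instance_feats @ w) @ v          # (n_instances,)
    scores = scores - scores.max()                    # numerically stable softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha, alpha @ instance_feats

rng = np.random.default_rng(1)
feats = rng.normal(size=(20, 8))                      # 20 instances, 8-d features
alpha, bag = attention_mil_pool(feats, rng.normal(size=(8, 4)), rng.normal(size=4))
```

Because the weights sum to one, the bag embedding stays on the same scale regardless of how many slices a scan contributes, which is what makes the approach weakly supervised: only the bag label is needed.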
14. A deformable CNN architecture for predicting clinical acceptability of ECG signal
EN
The degraded quality of electrocardiogram (ECG) signals is the main source of false alarms in critical care units. Therefore, a preliminary analysis of the ECG signal is required to decide its clinical acceptability. In conventional techniques, different handcrafted features are extracted from the ECG signal based on signal quality indices (SQIs) to predict clinical acceptability. A one-dimensional deformable convolutional neural network (1D-DCNN) is proposed in this work to extract features automatically, without manual intervention, and thus detect the clinical acceptability of ECG signals efficiently. To create the DCNN, deformable convolution and pooling layers are merged into a regular convolutional neural network (CNN) architecture. In a DCNN, the equidistant sampling locations of a regular CNN are replaced with adaptive sampling locations, which improves the network's ability to learn from the input: deformable convolution layers concentrate on significant segments of the ECG signal rather than giving equal attention to all segments. The proposed method detects acceptable and unacceptable ECG signals with an accuracy of 99.50%, recall of 99.78%, specificity of 99.60%, precision of 99.47%, and F-score of 0.999. Experimental results show that the proposed method performs better than earlier state-of-the-art techniques.
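The adaptive-sampling idea behind deformable convolution can be sketched in one dimension: shift the regular sampling grid by learned fractional offsets and read values by linear interpolation, which keeps the operation differentiable with respect to the offsets. The signal and offsets below are toy values, not learned ones:

```python
import numpy as np

def deformable_sample(signal, centers, offsets):
    """Sample a 1D signal at fractional locations centers + offsets.

    centers: the regular (integer) sampling grid of an ordinary convolution;
    offsets: per-location fractional shifts (predicted by the network in a
    real DCNN). Linear interpolation handles the non-integer positions.
    """
    pos = np.clip(centers + offsets, 0, len(signal) - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, len(signal) - 1)
    frac = pos - lo
    return (1 - frac) * signal[lo] + frac * signal[hi]

sig = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
vals = deformable_sample(sig,
                         centers=np.array([1.0, 3.0]),
                         offsets=np.array([0.5, -0.25]))
```

A deformable convolution layer applies its kernel to values sampled this way, so the receptive field can drift toward informative ECG segments instead of staying on the fixed grid.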
EN
This paper presents a new customized hybrid approach for the early detection of cardiac abnormalities using the electrocardiogram (ECG). The ECG is a bio-electrical signal that helps monitor the heart's electrical activity. It can provide health information about the normal and abnormal physiology of the heart, and early diagnosis of cardiac abnormalities is critical for cardiac patients to avoid stroke or sudden cardiac death. The main aim of this paper is to detect crucial beats that can impair the functioning of the heart. Initially, a modified Pan-Tompkins algorithm identifies the characteristic points, followed by heartbeat segmentation. Subsequently, a hybrid deep convolutional neural network (CNN) is proposed and evaluated on standard and real-time long-term ECG databases. This work successfully classifies several cardiac beat abnormalities, such as supraventricular ectopic beats (SVE), ventricular beats (VE), intra-ventricular conduction disturbance beats (IVCD), and normal beats (N). The obtained classification results show a good accuracy of 99.28% with an F1 score of 99.24% on the MIT-BIH database and a decent accuracy of 99.12% on the real-time acquired database.
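The Pan-Tompkins stage mentioned above can be sketched in its simplest form: differentiate, square, integrate over a moving window, and threshold. This is a bare-bones illustration on a synthetic spike train; the real algorithm adds band-pass filtering and adaptive thresholds, and the 0.15 s window and 0.5 threshold factor here are assumptions:

```python
import numpy as np

def detect_r_peaks(ecg, fs, window_s=0.15):
    """Simplified Pan-Tompkins-style R-peak detector."""
    diff = np.diff(ecg)                       # emphasize steep QRS slopes
    energy = diff ** 2                        # squaring makes all peaks positive
    win = max(1, int(window_s * fs))
    integ = np.convolve(energy, np.ones(win) / win, mode="same")
    thr = 0.5 * integ.max()                   # fixed threshold (adaptive in the original)
    above = integ > thr
    # one peak index per contiguous above-threshold region
    starts = np.flatnonzero(np.diff(above.astype(int)) == 1) + 1
    return [s + np.argmax(ecg[s:s + win]) for s in starts]

ecg = np.zeros(1000)
ecg[[200, 500, 800]] = 1.0                    # three synthetic "R peaks"
peaks = detect_r_peaks(ecg, fs=250)
```

Once the R peaks are located, fixed-length windows around them give the segmented heartbeats that the hybrid CNN classifies.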
EN
Contrast-enhanced magnetic resonance imaging (CE-MRI) is one of the methods routinely used in clinics for the diagnosis of renal impairments. It allows assessment of kidney perfusion and also visualization of various lesions and tissue atrophy due to, e.g., renal artery stenosis (RAS). An important indicator of the renal tissue state is the volume and shape of the kidney. It is therefore highly desirable to equip radiological units in clinics with software capable of automatic segmentation of the kidneys in CE-MRI images. This paper proposes a solution to this task using an original deep neural network architecture. The proposed design employs a three-branch convolutional neural network specialized in: 1) detection of renal parenchyma within an MR image patch, 2) segmentation of the whole kidney and 3) annotation of the renal cortex. We tested our architecture on normal kidneys in healthy subjects and on poorly perfused organs in RAS patients. The accuracy of renal parenchyma segmentation was equal to 0.94 in terms of the intersection over union (IoU) ratio. The accuracy of the cortex segmentation depends on the tissue health condition and ranges from 0.76 up to 0.92 IoU.
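The IoU metric used to report both results above is simple to state precisely. A minimal sketch on two synthetic binary masks (the mask contents are invented for illustration):

```python
import numpy as np

def iou(pred, target):
    """Intersection over union between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    return np.logical_and(pred, target).sum() / union if union else 1.0

a = np.zeros((10, 10), int); a[2:8, 2:8] = 1    # 36 pixels
b = np.zeros((10, 10), int); b[4:10, 4:10] = 1  # 36 pixels, 4x4 overlap
score = iou(a, b)
```

An IoU of 0.94 for parenchyma means the predicted and reference kidney masks overlap almost entirely; the 0.76–0.92 cortex range reflects how thin, poorly perfused structures are harder to delineate.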
EN
Scoliosis is a 3D spinal deformation in which the spine takes a lateral curvature, forming an angle in the coronal plane. Diagnosis of scoliosis requires periodic examination, and frequent exposure to radiative imaging may cause cancer. A safer and more economical alternative imaging modality, 3D ultrasound, is being explored. However, unlike radiative modalities, ultrasound images are noisy, which often suppresses the image's useful information. In this research, a novel hybridized CNN architecture, the multi-scale feature fusion Skip-Inception U-Net (SIU-Net), is proposed for fully automatic bony feature detection, which can further be used to assess the severity of scoliosis safely and automatically. The proposed architecture incorporates two novel features into the basic U-Net architecture: (a) an improvised Inception block and (b) newly designed decoder-side dense skip pathways. The proposed model is tested on 109 spine ultrasound image datasets. The architecture is evaluated using three popular metrics, (i) the Jaccard index, (ii) the Dice coefficient and (iii) the Euclidean distance, and compared with (a) the basic U-Net segmentation model, (b) the more evolved UNet++ model, and (c) the newly developed MultiResUNet model. The results show that SIU-Net gives the clearest segmentation output, especially in important regions of interest such as thoracic and lumbar bony features. The method also gives the highest average Jaccard score (0.781) and Dice score (0.883) and the lowest histogram Euclidean distance (0.011) of the four models. SIU-Net looks promising for meeting the objectives of a fully automatic scoliosis detection system.
18. Transfer learning techniques for medical image analysis: A review
EN
Medical imaging is a useful tool for disease detection, and diagnostic imaging technology has enabled early diagnosis of medical conditions. Manual image analysis methods are labor-intensive and susceptible to intra- as well as inter-observer variability. Automated medical image analysis techniques can overcome these limitations. In this review, we investigated transfer learning (TL) architectures for automated medical image analysis. We found that TL has been applied to a wide range of medical imaging tasks, such as segmentation, object identification, disease categorization and severity grading, to name a few. We could establish that TL provides high-quality decision support and requires less training data than traditional deep learning methods. These advantageous properties arise from the fact that TL models have already been trained on large generic datasets and a task-specific dataset is used only to customize the model, eliminating the need to train the models from scratch. Our review shows that AlexNet, ResNet, VGGNet, and GoogleNet are the most widely used TL models for medical image analysis. We found that these models can understand medical images, and the customization refines this ability, making these TL models useful tools for medical image analysis.
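The "frozen backbone, small task-specific head" pattern the review describes can be sketched without any deep learning framework. Here a fixed random projection stands in for a pretrained convolutional backbone, and only a logistic-regression head is trained on the small dataset; all sizes, learning rates and data are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" feature extractor: a frozen projection standing in for the
# backbone of AlexNet/ResNet/VGGNet/GoogleNet. It is never updated below.
W_frozen = 0.1 * rng.normal(size=(20, 16))
def extract(x):
    return np.tanh(x @ W_frozen)

# Small task-specific dataset; only the classification head is trained.
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
feats = extract(X)
w, b = np.zeros(16), 0.0
for _ in range(500):                          # gradient descent on the head only
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    w -= 0.5 * feats.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()
acc = (((feats @ w + b) > 0) == (y > 0.5)).mean()
```

The key property the review highlights shows up even in this toy: the backbone's parameters never change, so the small dataset only has to fit the head, which is why TL needs far less task-specific data than training from scratch.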
19. The quantitative application of channel importance in movement intention decoding
EN
The complex brain network consists of multiple collaborating regions, which can be activated to varying degrees by motor imagery (MI); the induced electroencephalogram (EEG), recorded by an array of scalp electrodes, is usually decoded to drive a rehabilitation system. Either all channels or a partially selected subset is applied with equal weight to recognize movement intention, which may be incompatible with the individual differences of channels at different locations. In this paper, a channel importance based imaging method is proposed, denoted as CIBI. For each electrode of the MI-EEG, the power over the 8–30 Hz band is calculated from the discrete Fourier spectrum and input to the random forest (RF) algorithm to quantify its contribution, namely the channel importance (CI). Then, CI is used to weight the powers of the α and β rhythms, which are interpolated to a 32 × 32 grid using the Clough-Tocher method, generating two main-band images with time-frequency-space information. In addition, a dual branch fusion convolutional neural network (DBFCNN) is developed to match the characteristics of the two MI images, realizing the extraction, fusion and classification of comprehensive features. Extensive experiments conducted on two public datasets with four classes of MI-EEG yield comparatively high average accuracies, with improvements of 23.95% and 25.14%, respectively, when channel importance is used; statistical analyses are also performed using the Kappa value, confusion matrix and receiver operating characteristic. The experimental results show that the personalized channel importance helps enhance inter-class separability and that the proposed method has outstanding decoding ability for multiple MI tasks.
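The first stage above, per-channel 8–30 Hz band power from the discrete Fourier spectrum, can be sketched directly with an FFT. The synthetic two-channel signal and normalization are illustrative; in the paper these per-channel powers feed the random forest that estimates channel importance:

```python
import numpy as np

def band_power(eeg, fs, lo=8.0, hi=30.0):
    """Per-channel power in the lo-hi Hz band from the DFT magnitude spectrum.

    eeg: (n_channels, n_samples). Returns one power value per channel.
    """
    spec = np.abs(np.fft.rfft(eeg, axis=1)) ** 2
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    return spec[:, band].sum(axis=1) / eeg.shape[1]

fs = 250
t = np.arange(500) / fs
eeg = np.stack([np.sin(2 * np.pi * 10 * t),    # 10 Hz: inside the 8-30 Hz band
                np.sin(2 * np.pi * 40 * t)])   # 40 Hz: outside the band
p = band_power(eeg, fs)
```

Channels whose band power the random forest finds predictive receive larger CI weights, so their α/β powers dominate the interpolated 32 × 32 images.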
EN
In the clinic, mammographic masses appear as asymmetric structures between the left and right breasts. In this paper, we design a bilateral image analysis method based on a convolutional neural network that can detect and classify breast mass regions simultaneously. It consists of three main parts: a feature-similarity based region matching technique, a mass region of interest (ROI) selection step, and a deep metric learning based classifier. First, discriminative score maps are calculated from the deep features extracted from the bilateral left and right mammograms in the global or local spatial image domain. The contralateral correspondences are determined by the minimum discriminative scores. Second, to select the mass candidate ROIs and further remove false-positive mass-to-normal pairs, we propose a dynamic histogram weighting mechanism with three new constraints imposed on the distribution of the discriminative score histogram. In addition, a novel soft-label based deep metric learning regularization is designed for the mass ROI classifier to tackle the large variation of masses in shape, size, texture and breast density. We apply the method to the open Digital Database for Screening Mammography. Compared with other state-of-the-art approaches, the proposed scheme gives competitive results in classification and localization tasks for mammographic lesions.