CN117764917A - YOLOv8-based lung nodule image detection method - Google Patents
- Publication number: CN117764917A
- Application number: CN202311570795.5A
- Authority: CN (China)
- Prior art keywords: yolov8, formula, feature map, lung, detection method
- Prior art date: 2023-11-22
- Legal status: Pending
Description
Technical Field

The present invention relates to the technical field of image detection, and in particular to a YOLOv8-based pulmonary nodule image detection method.
Background

Lung cancer is a malignant tumor that seriously endangers human health; early diagnosis and timely intervention are key to improving clinical outcomes. Current diagnostic techniques for pulmonary nodules include CT, chest X-ray, and MRI. Among them, CT is widely used in clinical practice because of its high resolution, high sensitivity, and strong specificity. Traditional pulmonary nodule detection relies mainly on physicians' experience and visual inspection, which is time-consuming, inefficient, and error-prone. With advances in computing and image processing technology, pulmonary nodules can be automatically identified and located through automated or semi-automated analysis of CT images, greatly improving the efficiency and accuracy of early lung cancer screening.

In recent years, deep-learning-based medical image analysis has been widely adopted. Among these approaches, convolutional neural networks are currently the dominant method for pulmonary nodule identification. Traditional CNN-based object detection methods include Faster R-CNN, YOLO, and SSD. These algorithms can automatically detect and locate pulmonary nodules through trained models, greatly improving the accuracy and reliability of early lung cancer screening and diagnosis. Among them, YOLO is an efficient, fast, and accurate detection method, but it still faces many problems and challenges: small pulmonary nodules are difficult to detect; interference from pulmonary arteries, lymph nodes, and bones easily causes missed or false diagnoses; and as detection speed increases, accuracy declines. How to improve its modeling and algorithm to raise accuracy and robustness is therefore an urgent problem.
Summary of the Invention

The present invention provides a YOLOv8-based pulmonary nodule image detection method to solve the technical problems of the prior art, in which smaller pulmonary nodules are easily missed, accuracy is low, and detection is slow.

The present invention provides a YOLOv8-based pulmonary nodule image detection method comprising the following steps:

Step 1: Obtain a lung image dataset and preprocess it;

Step 2: Perform lung parenchyma segmentation on the lung images preprocessed in step 1;

Step 3: Add an NR-SCA module after the C2f convolution module in the backbone network of the YOLOv8 model, wherein the NR-SCA module comprises a Gaussian filter, a coordinate attention mechanism, and a spatial attention mechanism; data input to the NR-SCA module first passes through the Gaussian filter and then enters the coordinate attention mechanism and the spatial attention mechanism separately; the outputs of the two attention mechanisms are each element-wise multiplied with the output of the Gaussian filter, and the two products are concatenated and fused as the output of the NR-SCA module;

Step 4: Use a weighted combination of the size similarity formula of the NWD loss function and the IOU similarity formula of the CIOU loss function as the loss function of the YOLOv8 model;

Step 5: Train the constructed YOLOv8 model with the lung images obtained in step 2, and detect pulmonary nodule images with the trained YOLOv8 model.
Further, in step 3, the output of the NR-SCA module is:

$$y = \mathrm{Concat}\left(\tilde{x} \odot y_c,\ \tilde{x} \odot y_s\right)$$

where $\tilde{x}$ is the feature map after Gaussian-filter noise reduction, $y_c$ is the feature map output by the coordinate attention mechanism, $y_s$ is the feature map output by the spatial attention mechanism, and $\odot$ denotes the element-wise product.
Further, in step 3, the output of the Gaussian filter is:

$$\tilde{X}_{i,j,k} = \sum_{u}\sum_{v} \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{u^{2}+v^{2}}{2\sigma^{2}}\right) X_{i+u,\,j+v,\,k}$$

where $\tilde{X}_{i,j,k}$ is the pixel at position $(i,j)$ in the filtered output feature map; $X_{i+u,j+v,k}$ is the corresponding pixel value of the input feature map; $k$ is the feature-map channel index; $(u,v)$ is the position offset of the filter; $(i,j)$ is the position in the output feature map; and $\sigma$ is the standard deviation of the Gaussian filter.
Further, in step 3, the output of the coordinate attention mechanism is:

$$y_c(i,j) = X_c(i,j) \times g_c^{h}(i) \times g_c^{w}(j)$$

where $X_c(i,j)$ is the value of the input feature map at channel $c$, vertical position $i$, and horizontal position $j$; $g_c^{h}(i)$ is the channel attention weight in the vertical direction; and $g_c^{w}(j)$ is the channel attention weight in the horizontal direction.
Further, in step 3, the output of the spatial attention mechanism is:

$$y_s = \sigma\!\left(f^{7\times 7}\!\left(\left[\mathrm{AvgPool}(\tilde{x});\ \mathrm{MaxPool}(\tilde{x})\right]\right)\right)$$

where $[\mathrm{AvgPool}(\tilde{x});\mathrm{MaxPool}(\tilde{x})]$ is the feature map obtained by concatenating the average-pooled and max-pooled feature maps along the channel dimension; $f^{7\times 7}$ denotes a 7×7 convolution applied to the concatenated feature map; and $\sigma$ is the activation function.
Further, step 3 also provides that the Neck part of the YOLOv8 model uses a BiFPN structure.
Further, in step 4, the loss function of the constructed YOLOv8 model is:

$$S_{NWD}(N_a, N_b) = \alpha\, S_{IOU}(N_a, N_b) + \beta\, W_\theta(N_a, N_b)$$

where $S_{IOU}(N_a,N_b)$ denotes the IOU similarity of $N_a$ and $N_b$; $W_\theta(N_a,N_b)$ denotes the size similarity of $N_a$ and $N_b$; $C$ is the parameter of the nonlinear transformation function; and $\alpha$ and $\beta$ are adjustment parameters, with $\alpha\in(0,1)$ and $\beta\in(1,+\infty)$.
Further, step 4 also includes optimizing the size similarity formula of the NWD loss function. The optimized size similarity formula is:

$$W_\theta(N_a, N_b) = \sum_{i} \lambda_i \exp\!\left(-\theta\,\frac{\left|h_i^{a}-h_i^{b}\right| + \left|w_i^{a}-w_i^{b}\right|}{C}\right)$$

where $h_i$ and $w_i$ are the height and width of the $i$-th feature scale; $\theta$ is a positive constant giving the decay rate of the size similarity; $C$ is the baseline of the size similarity, i.e., the parameter of the nonlinear transformation function; and $\lambda_i$ is the weight of the feature scale.
Further, step 4 also includes optimizing the IOU similarity formula of the CIOU loss function. The optimized IOU similarity formula is:

$$S_{IOU}(N_a, N_b) = \frac{\mathrm{Area}(N_a \cap N_b)}{\mathrm{Area}(N_a \cup N_b)}$$

where $\mathrm{Area}(N_a \cap N_b)$ is the intersection area of the two bounding boxes and $\mathrm{Area}(N_a \cup N_b)$ is their union area.
Further, step 4 also includes optimizing the parameter of the nonlinear transformation function. The optimized parameter is:

$$C = \frac{1}{N}\sum_{i=1}^{N}\left(h_i + w_i\right)$$

where $N$ is the number of feature scales, $h_i$ is the height of the $i$-th bounding box, and $w_i$ is the width of the $i$-th bounding box.
Beneficial Effects of the Invention:

The present invention proposes a noise-reduction spatial-coordinate attention (NR-SCA) mechanism. By combining Gaussian-filter noise reduction, spatial attention, and coordinate attention in a single module, noise reduction, spatial position, and nodule coordinate information are considered simultaneously, achieving better performance. This combination enables the model to detect pulmonary nodules of different positions and sizes more effectively, improving detection accuracy and robustness.

The present invention uses a bidirectional weighted feature pyramid network, which simplifies the network parameters and makes feature fusion more precise.

The SNWD loss function of the present invention, built by combining size similarity and IOU similarity, considers the size and position information of objects in target detection more comprehensively. Size similarity helps handle the detection of objects of different sizes, while IOU similarity helps localize objects accurately. Combining them better balances detection precision against localization accuracy and improves the detection efficiency for small nodules.
Brief Description of the Drawings

The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are schematic and should not be construed as limiting the invention in any way. In the drawings:

Figure 1 is a schematic diagram of the YOLOv8 model structure of the present invention;

Figure 2 is a flow chart of a specific embodiment of the present invention;

Figure 3 is a schematic diagram of the NR-SCA module structure in a specific embodiment of the present invention;

Figure 4 is a schematic diagram of the bidirectional weighted feature pyramid (BiFPN) structure in a specific embodiment of the present invention.
Detailed Description of Embodiments

To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a YOLOv8-based pulmonary nodule image detection method. The overall flow, shown in Figure 2, includes the following steps:

Step 1: Obtain a lung image dataset and preprocess it.

The lung image dataset used is LIDC-IDRI, a public pulmonary medical imaging dataset containing CT scans of 1018 patients together with the corresponding pulmonary nodule lesion annotations, which were produced by four experienced medical experts in a two-stage process. Based on the linear attenuation coefficients of air and water for X-rays, the lung CT scan data are converted into JPG format.
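As a concrete illustration of this conversion, the following is a minimal sketch, assuming the scans are stored as DICOM slices carrying standard rescale tags; the lung-window values and file names are illustrative assumptions, not taken from the patent:

```python
import numpy as np
import pydicom
from PIL import Image

def dicom_slice_to_jpg(dcm_path: str, jpg_path: str,
                       window_center: float = -600.0,
                       window_width: float = 1500.0) -> None:
    """Convert one CT slice to an 8-bit JPG via Hounsfield units and a lung window."""
    ds = pydicom.dcmread(dcm_path)
    # Rescale raw pixel values to Hounsfield units (HU)
    hu = ds.pixel_array.astype(np.float32) * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
    # Apply the window, then map to [0, 255]
    lo = window_center - window_width / 2
    hi = window_center + window_width / 2
    img = np.clip((hu - lo) / (hi - lo), 0.0, 1.0) * 255.0
    Image.fromarray(img.astype(np.uint8)).save(jpg_path)

# Example usage: dicom_slice_to_jpg("slice_0001.dcm", "slice_0001.jpg")
```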
Step 2: Perform lung parenchyma segmentation on the lung images preprocessed in step 1.

Because the U-Net segmentation network has a simple structure and lacks extraction of local detail features, it loses fine target-boundary information and its segmentation accuracy drops. The present invention therefore uses a convolutional neural network combined with an attention mechanism (Attention U-Net) for lung parenchyma segmentation, so that the network focuses on the regions of interest, enhancing the sensitivity and accuracy of the model and allowing it to identify target boundaries well in different scenarios. The segmentation result can be expressed as:
$$S(X) = \mathrm{Conv}\left(\mathrm{Attention}\left(X, C(E(X))\right) + D\left(C(E(X))\right),\ W\right)$$
where $S(X)$ is the final output of the segmentation network; $\mathrm{Attention}(X, C(E(X)))$ is the output of the attention mechanism; $\mathrm{Conv}$ is the convolution of the final segmentation layer; $D(C(E(X)))$ is the output of the decoder block; and $W$ is the convolution kernel.
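To make the attention term in this expression concrete, below is a minimal sketch of the additive attention gate commonly used in Attention U-Net. It assumes the gating signal g and the skip features x have already been brought to the same spatial resolution; the channel arguments are illustrative, not the patent's exact configuration:

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate in the style of Attention U-Net."""
    def __init__(self, gate_channels: int, skip_channels: int, inter_channels: int):
        super().__init__()
        self.w_g = nn.Conv2d(gate_channels, inter_channels, kernel_size=1)
        self.w_x = nn.Conv2d(skip_channels, inter_channels, kernel_size=1)
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, g: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # g: gating signal from the decoder path; x: encoder skip features
        a = self.relu(self.w_g(g) + self.w_x(x))
        alpha = torch.sigmoid(self.psi(a))  # per-pixel attention coefficients in [0, 1]
        return x * alpha                    # re-weighted skip features
```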
Step 3: Add an NR-SCA module after the C2f convolution module in the backbone network of the YOLOv8 model, with the Neck part of the YOLOv8 model using a BiFPN structure. The NR-SCA module comprises a Gaussian filter, a coordinate attention mechanism, and a spatial attention mechanism. Data input to the NR-SCA module first passes through the Gaussian filter and then enters the coordinate attention mechanism and the spatial attention mechanism separately; the outputs of the two attention mechanisms are each element-wise multiplied with the output of the Gaussian filter, and the two products are concatenated and fused as the output of the NR-SCA module.
The output of the NR-SCA module is:

$$y = \mathrm{Concat}\left(\tilde{x} \odot y_c,\ \tilde{x} \odot y_s\right)$$

where $\tilde{x}$ is the feature map after Gaussian-filter noise reduction, $y_c$ is the feature map output by the coordinate attention mechanism, $y_s$ is the feature map output by the spatial attention mechanism, and $\odot$ denotes the element-wise product.
The output of the Gaussian filter is:

$$\tilde{X}_{i,j,k} = \sum_{u}\sum_{v} \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{u^{2}+v^{2}}{2\sigma^{2}}\right) X_{i+u,\,j+v,\,k}$$

where $\tilde{X}_{i,j,k}$ is the pixel at position $(i,j)$ in the filtered output feature map; $X_{i+u,j+v,k}$ is the corresponding pixel value of the input feature map; $k$ is the feature-map channel index; $(u,v)$ is the position offset of the filter; $(i,j)$ is the position in the output feature map; and $\sigma$ is the standard deviation of the Gaussian filter.
The output of the coordinate attention mechanism is:

$$y_c(i,j) = X_c(i,j) \times g_c^{h}(i) \times g_c^{w}(j)$$

where $X_c(i,j)$ is the value of the input feature map at channel $c$, vertical position $i$, and horizontal position $j$; $g_c^{h}(i)$ is the channel attention weight in the vertical direction; and $g_c^{w}(j)$ is the channel attention weight in the horizontal direction.
The output of the spatial attention mechanism is:

$$y_s = \sigma\!\left(f^{7\times 7}\!\left(\left[\mathrm{AvgPool}(\tilde{x});\ \mathrm{MaxPool}(\tilde{x})\right]\right)\right)$$

where $[\mathrm{AvgPool}(\tilde{x});\mathrm{MaxPool}(\tilde{x})]$ is the feature map obtained by concatenating the average-pooled and max-pooled feature maps along the channel dimension; $f^{7\times 7}$ denotes a 7×7 convolution applied to the concatenated feature map; and $\sigma$ is the activation function.
Adding NR-SCA attention to the backbone network focuses the model on the key features in the feature map. Compared with coordinate attention or spatial attention alone, jointly considering noise reduction, spatial information, and coordinate information handles the pulmonary nodule detection task more comprehensively, improves detection performance and perceptual ability, and simplifies the training and optimization process, helping to achieve more accurate and reliable pulmonary nodule detection. The specific structure is shown in Figure 3 and comprises the following steps:

Step 1: Apply Gaussian filtering to the input feature map for denoising. Lung CT images are affected by interfering structures such as blood vessels and bones; Gaussian filtering helps reduce noise in the CT image, smooth the image, and highlight pulmonary nodules. The formula is as follows:
$$\tilde{X}_{i,j,k} = \sum_{u}\sum_{v} \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{u^{2}+v^{2}}{2\sigma^{2}}\right) X_{i+u,\,j+v,\,k}$$

where $\tilde{X}_{i,j,k}$ is the pixel at position $(i,j)$ in the filtered output feature map; $X_{i+u,j+v,k}$ is the corresponding pixel value of the input feature map; $k$ is the feature-map channel index; $(u,v)$ is the position offset of the filter; $(i,j)$ is the position in the output feature map; and $\sigma$ is the standard deviation of the Gaussian filter. During filtering, the kernel slides over every pixel of the feature map and computes a weighted average of the surrounding pixels.
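A minimal PyTorch sketch of this denoising step, implemented as a depthwise convolution with a normalized Gaussian kernel; the kernel size and σ are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(ksize: int = 5, sigma: float = 1.0) -> torch.Tensor:
    """2D Gaussian kernel normalized to sum to 1."""
    ax = torch.arange(ksize, dtype=torch.float32) - (ksize - 1) / 2
    g1d = torch.exp(-ax ** 2 / (2 * sigma ** 2))
    k = torch.outer(g1d, g1d)
    return k / k.sum()

def gaussian_denoise(x: torch.Tensor, ksize: int = 5, sigma: float = 1.0) -> torch.Tensor:
    """Depthwise Gaussian smoothing of a (B, C, H, W) feature map."""
    c = x.shape[1]
    k = gaussian_kernel(ksize, sigma).to(device=x.device, dtype=x.dtype)
    weight = k.expand(c, 1, ksize, ksize).contiguous()  # one identical kernel per channel
    return F.conv2d(x, weight, padding=ksize // 2, groups=c)
```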
Step 2: Feed the Gaussian-filtered feature map into the coordinate attention branch. To improve the accuracy of nodule localization and to capture long-range spatial interactions through precise positional information, rather than compressing all spatial information into the channels, global average pooling is decomposed as follows:

$$z_c^{h}(h) = \frac{1}{W}\sum_{i=1}^{W} X_c(h, i), \qquad z_c^{w}(w) = \frac{1}{H}\sum_{j=1}^{H} X_c(j, w)$$

where $X_c(h,i)$ and $X_c(j,w)$ are the pixel values at the corresponding positions of the $c$-th channel of the feature map, $W$ is the feature-map width, and $H$ is the feature-map height. For an input feature map of size $C\times H\times W$, average pooling along the $W$ and $H$ directions yields feature maps of size $C\times H\times 1$ and $C\times 1\times W$, respectively. The generated $C\times 1\times W$ feature map is then transposed, and the two are concatenated:
$$f = \delta\left(F_1\left(\left[z^{h}, z^{w}\right]\right)\right)$$

where $[z^{h}, z^{w}]$ denotes concatenating the vertical-direction averages $z^{h}$ and the horizontal-direction averages $z^{w}$ into a new feature vector; $F_1$ denotes the forward pass of the neural network; $\delta$ is the activation function, which increases the expressive power of the network; and $f$ is the final feature vector obtained from this computation.
Step 3: Coordinate attention obtains spatial information in both the horizontal and vertical directions and fuses positional information into the attention mechanism, so that the model can enhance or suppress features according to the position of the pulmonary nodule. Along the spatial dimension, $f$ is split into $f^{h}\in\mathbb{R}^{C/r\times H\times 1}$ and $f^{w}\in\mathbb{R}^{C/r\times 1\times W}$; each part is then up-projected with a 1×1 convolution and passed through a sigmoid activation to obtain the final attention vectors $g^{h}\in\mathbb{R}^{C\times H\times 1}$ and $g^{w}\in\mathbb{R}^{C\times 1\times W}$:
$$g^{h} = \sigma\left(F_h\left(f^{h}\right)\right), \qquad g^{w} = \sigma\left(F_w\left(f^{w}\right)\right)$$
where $f^{h}$ and $f^{w}$ are the input feature vectors in the vertical and horizontal directions, $F_h$ and $F_w$ denote the forward passes of the network, $\sigma$ is the activation function, and $g^{h}$ and $g^{w}$ are the resulting vertical and horizontal attention vectors. After expanding $g^{h}$ and $g^{w}$, the final output of the coordinate attention branch can be expressed as:

$$y_c(i,j) = X_c(i,j) \times g_c^{h}(i) \times g_c^{w}(j)$$

where $X_c(i,j)$ is the value of the input feature map at channel $c$, vertical position $i$, and horizontal position $j$; $g_c^{h}(i)$ is the channel attention weight in the vertical direction; and $g_c^{w}(j)$ is the channel attention weight in the horizontal direction.
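Putting Steps 2 and 3 together, a compact sketch of the coordinate attention branch follows. It uses the standard coordinate attention design (directional pooling, a shared 1×1 transform, a split, and per-direction sigmoid gates); the reduction ratio is an illustrative assumption:

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate attention branch: y_c(i, j) = X_c(i, j) * g_h(i) * g_w(j)."""
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)   # F1: shared 1x1 transform
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()                              # delta: activation
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)  # F_h: restore channels
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)  # F_w: restore channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        z_h = x.mean(dim=3, keepdim=True)                      # (B, C, H, 1): pool along W
        z_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (B, C, W, 1): pool along H
        f = self.act(self.bn(self.conv1(torch.cat([z_h, z_w], dim=2))))
        f_h, f_w = torch.split(f, [h, w], dim=2)               # split back into directions
        g_h = torch.sigmoid(self.conv_h(f_h))                  # (B, C, H, 1)
        g_w = torch.sigmoid(self.conv_w(f_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * g_h * g_w
```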
Step 4: Spatial attention learns a weight distribution over the feature map, enhancing the perception of the morphology and characteristics of pulmonary nodules and improving the feature representation of the model. The Gaussian-filtered feature map is fed into the spatial attention branch; max pooling and average pooling produce two $1\times H\times W$ feature maps, which are concatenated by a Concat operation, reduced to a single-channel feature map by a 7×7 convolution, and passed through a sigmoid to obtain the spatial attention feature map:

$$y_s = \sigma\!\left(f^{7\times 7}\!\left(\left[\mathrm{AvgPool}(\tilde{x});\ \mathrm{MaxPool}(\tilde{x})\right]\right)\right)$$

where $[\mathrm{AvgPool}(\tilde{x});\mathrm{MaxPool}(\tilde{x})]$ denotes concatenating the average-pooled and max-pooled feature maps along the channel dimension to obtain a richer feature map; $f^{7\times 7}$ denotes a 7×7 convolution applied to the concatenated feature map; and $\sigma$ is the activation function.
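A matching sketch of the spatial attention branch; following the formula above, it returns the 1×H×W attention map itself, and the multiplication with the denoised feature map happens in the fusion of Step 5:

```python
import torch
import torch.nn as nn

class SpatialAtt(nn.Module):
    """Spatial attention map: sigmoid(conv7x7([AvgPool; MaxPool]))."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average pooling -> (B, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)   # channel-wise max pooling     -> (B, 1, H, W)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # (B, 1, H, W)
```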
Step 5: The feature maps output by the coordinate attention branch and the spatial attention branch are each element-wise multiplied with the Gaussian-denoised feature map, and the two product feature maps are concatenated and fused along the channel dimension. The fused NR-SCA attention exploits the synergy of the three modules, reduces the information loss of repeated forward passes between modules, and retains global information, yielding the final output feature map:

$$y = \mathrm{Concat}\left(\tilde{x} \odot y_c,\ \tilde{x} \odot y_s\right)$$

where $\tilde{x}$ is the feature map after Gaussian-filter noise reduction, $y_c$ is the feature map output by the coordinate attention branch, $y_s$ is the feature map output by the spatial attention branch, and $\odot$ denotes the element-wise product.
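The fusion itself then reduces to an element-wise product of each branch with the denoised features followed by a channel concatenation, as in this sketch (the branch outputs are taken as inputs so the example stays self-contained; how the doubled channel count is subsequently reduced, e.g. by a 1×1 convolution, is not specified in the patent and would be an implementation choice):

```python
import torch

def nr_sca_fuse(x_tilde: torch.Tensor, y_c: torch.Tensor, y_s: torch.Tensor) -> torch.Tensor:
    """y = Concat(x_tilde * y_c, x_tilde * y_s).

    x_tilde: (B, C, H, W) Gaussian-denoised feature map
    y_c:     (B, C, H, W) coordinate attention branch output
    y_s:     (B, 1, H, W) spatial attention map, broadcast across channels
    """
    return torch.cat([x_tilde * y_c, x_tilde * y_s], dim=1)  # (B, 2C, H, W)
```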
Replacing the original PANet module with a BiFPN module allows the different contributions of input feature maps at different resolutions to be fused with learned weights. BiFPN removes nodes that have only one input edge and adds an extra edge between the original input node and the output node at the same level to fuse more features; it treats each top-down and bottom-up path as one feature network layer and repeats that layer several times to achieve higher-level feature fusion. The structure is shown in Figure 4.

BiFPN uses a weighted feature fusion method that scales the weights into [0,1], making training fast and efficient:
$$O = \sum_{i} \frac{w_i}{\epsilon + \sum_{j} w_j}\; I_i$$

where $w_i$ is a weight parameter, $I_i$ is an input feature map, $O$ is the fused feature map, and $\epsilon$ is a small constant that prevents the denominator from becoming zero.
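A minimal sketch of this fast normalized fusion, with learnable weights clamped non-negative via ReLU (a common BiFPN implementation choice, assumed here):

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """BiFPN fusion: O = sum_i (w_i / (eps + sum_j w_j)) * I_i."""
    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs: list[torch.Tensor]) -> torch.Tensor:
        w = torch.relu(self.w)          # keep weights non-negative
        w = w / (self.eps + w.sum())    # normalize into [0, 1]
        return sum(wi * x for wi, x in zip(w, inputs))
```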
Step 4: Use a weighted combination of the size similarity formula of the NWD loss function and the IOU similarity formula of the CIOU loss function as the loss function of the YOLOv8 model.
The loss function of the constructed YOLOv8 model is:

$$S_{NWD}(N_a, N_b) = \alpha\, S_{IOU}(N_a, N_b) + \beta\, W_\theta(N_a, N_b)$$
Here, the IOU similarity formula is:

$$S_{IOU}(N_a, N_b) = \frac{\mathrm{Area}(N_a \cap N_b)}{\mathrm{Area}(N_a \cup N_b)}$$

where $\mathrm{Area}(N_a \cap N_b)$ is the intersection area of the two bounding boxes and $\mathrm{Area}(N_a \cup N_b)$ is their union area;
the size similarity formula is:

$$W_\theta(N_a, N_b) = \sum_{i} \lambda_i \exp\!\left(-\theta\,\frac{\left|h_i^{a}-h_i^{b}\right| + \left|w_i^{a}-w_i^{b}\right|}{C}\right)$$

where $h_i$ and $w_i$ are the height and width of the $i$-th feature scale, $C$ and $\theta$ are positive constants giving the baseline and the decay rate of the size similarity, and $\lambda_i$ is the weight of the feature scale;
and the parameter of the nonlinear transformation function is:

$$C = \frac{1}{N}\sum_{i=1}^{N}\left(h_i + w_i\right)$$

where $N$ is the number of feature scales, $h_i$ is the height of the $i$-th bounding box, and $w_i$ is the width of the $i$-th bounding box.
SNWD models a bounding box as a 2D Gaussian distribution and uses the Wasserstein distance to measure the similarity between two such Gaussians. To balance robustness and applicability when detecting nodules of different sizes, the similarity is split into an IOU similarity part and a size similarity part, which are fed into the nonlinear adjustment function with different weights, and the exponential nonlinear transformation function is adjusted using the mean absolute size, improving the detection efficiency for small nodules. Specifically:
Step 1: For small targets, the foreground pixels inside a bounding box tend to be concentrated in the center while the background pixels lie near the edges. To weight each pixel inside the bounding box more appropriately, the bounding box is modeled as a two-dimensional Gaussian distribution:

$$\mathcal{N}\!\left(\boldsymbol{\mu}, \boldsymbol{\Sigma}\right), \qquad \boldsymbol{\mu} = \begin{bmatrix} c_x \\ c_y \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \begin{bmatrix} w^{2}/4 & 0 \\ 0 & h^{2}/4 \end{bmatrix}$$

where $c_x$, $c_y$, $w$, and $h$ are the center coordinates, width, and height of the horizontal bounding box.
Step 2: The Wasserstein distance from optimal transport theory is used to compute the distance between the two distributions, and a normalized exponential of this distance gives a new metric. The second-order Wasserstein distance and the normalized Wasserstein distance are defined as:

$$W_2^{2}(\mu_1, \mu_2) = \left\|m_1 - m_2\right\|_2^{2} + \left\|\Sigma_1^{1/2} - \Sigma_2^{1/2}\right\|_F^{2}$$

$$NWD(N_a, N_b) = \exp\!\left(-\frac{\sqrt{W_2^{2}(N_a, N_b)}}{C}\right)$$

where $\mu_1$ and $\mu_2$ are the two Gaussian distributions, $m_1$ and $m_2$ are their means, $\Sigma_1^{1/2}$ and $\Sigma_2^{1/2}$ are the square roots of their covariance matrices, $\|\cdot\|_2$ denotes the Euclidean distance, $NWD(N_a,N_b)$ is the normalized Wasserstein distance, and $C$ is a positive constant related to the dataset.
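Because both covariance matrices are diagonal under this bounding-box modeling, the 2-Wasserstein distance has a simple closed form, as in the sketch below; the default value of C is only a placeholder for a dataset-dependent constant:

```python
import math

def nwd(box_a, box_b, C: float = 12.8) -> float:
    """Normalized Wasserstein distance between boxes given as (cx, cy, w, h),
    each modeled as a 2D Gaussian N([cx, cy], diag(w^2/4, h^2/4))."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two axis-aligned Gaussians
    w2 = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
          + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2) / C)
```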
Step 3: The NWD method considers only two features of a bounding box: its center position and its width and height. In practice, however, the bounding box of a pulmonary nodule may involve other features such as texture, color, and shape. The similarity between bounding boxes can therefore be computed at different feature scales to describe their relationship more comprehensively. The IOU similarity and size similarity expressions are:

$$S_{IOU}(N_a, N_b) = \frac{\mathrm{Area}(N_a \cap N_b)}{\mathrm{Area}(N_a \cup N_b)}, \qquad W_\theta(N_a, N_b) = \sum_{i} \lambda_i \exp\!\left(-\theta\,\frac{\left|h_i^{a}-h_i^{b}\right| + \left|w_i^{a}-w_i^{b}\right|}{C}\right)$$

where $\mathrm{Area}(N_a \cap N_b)$ is the intersection area of the two bounding boxes, $\mathrm{Area}(N_a \cup N_b)$ is their union area, $h_i$ and $w_i$ are the height and width of the $i$-th feature scale, $C$ and $\theta$ are positive constants giving the baseline and decay rate of the size similarity, and $\lambda_i$ is the weight of the feature scale, indicating how much that scale contributes to the size similarity.
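Under the size-similarity formula as reconstructed above, a per-scale implementation might look like the following sketch; treating (h_i, w_i) as per-scale box dimensions and λ_i, θ, C as given constants is an interpretation of the patent's variable definitions, not a definitive reading:

```python
import math
from typing import Sequence, Tuple

def size_similarity(scales_a: Sequence[Tuple[float, float]],
                    scales_b: Sequence[Tuple[float, float]],
                    lambdas: Sequence[float],
                    theta: float, C: float) -> float:
    """W_theta: weighted exponential decay of per-scale (height, width) differences."""
    total = 0.0
    for (ha, wa), (hb, wb), lam in zip(scales_a, scales_b, lambdas):
        total += lam * math.exp(-theta * (abs(ha - hb) + abs(wa - wb)) / C)
    return total
```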
Step 4: The size relationship between targets is crucial for correct localization and recognition. To better account for the average target size in the dataset and make the size similarity computation more reasonable, thereby improving the generalization ability of the model, the parameter $C$ of the new exponential nonlinear transformation function is constructed as the average of the sums of the heights and widths of all bounding boxes:

$$C = \frac{1}{N}\sum_{i=1}^{N}\left(h_i + w_i\right)$$

where $N$ is the number of feature scales, $h_i$ is the height of the $i$-th bounding box, and $w_i$ is the width of the $i$-th bounding box. The SNWD expression is then obtained by weighted combination:
$$S_{NWD}(N_a, N_b) = \alpha\, S_{IOU}(N_a, N_b) + \beta\, W_\theta(N_a, N_b)$$

where $S_{IOU}(N_a,N_b)$ denotes the IOU similarity of $N_a$ and $N_b$, $W_\theta(N_a,N_b)$ denotes their size similarity, and $\alpha$ and $\beta$ are adjustment parameters with $\alpha\in(0,1)$ and $\beta\in(1,+\infty)$, used to balance the influence of the IOU similarity and the size similarity.
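Putting the pieces together, a hedged end-to-end sketch of the SNWD score; the additive α/β weighting follows the reconstruction above, the IoU helper is inlined so the example is self-contained, and the sample α and β are merely values inside the ranges the patent states:

```python
def iou(a, b) -> float:
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def snwd(a, b, w_theta: float, alpha: float = 0.5, beta: float = 1.5) -> float:
    """SNWD score: alpha * S_IOU + beta * W_theta, with alpha in (0,1), beta > 1.

    w_theta is the size similarity computed as in the previous sketch."""
    return alpha * iou(a, b) + beta * w_theta
```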
Step 5: Train the constructed YOLOv8 model with the lung images obtained in step 2, and use the trained YOLOv8 model to detect and identify pulmonary nodules in pulmonary nodule images.

Through these improvements to the YOLOv8 model, whose structure is shown in Figure 1, noise interference is reduced, the coordinate attention's focus on local features is strengthened, and the fusion network parameters are simplified, effectively improving the detection of small pulmonary nodules and reducing the missed-detection rate, thereby achieving fast and accurate detection of pulmonary nodules.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims (10)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311570795.5A | 2023-11-22 | 2023-11-22 | YOLOv8-based lung nodule image detection method |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN117764917A | 2024-03-26 |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118447023A (en) * | 2024-07-08 | 2024-08-06 | 合肥工业大学 | Method, device and system for detecting embedded part and storage medium |
| CN118447023B (en) * | 2024-07-08 | 2024-10-29 | 合肥工业大学 | Method, device and system for detecting embedded part and storage medium |
| CN118747804A (en) * | 2024-07-18 | 2024-10-08 | 南通理工学院 | A full-scale lung image segmentation algorithm based on hybrid connection |
| CN118552535A (en) * | 2024-07-29 | 2024-08-27 | 华侨大学 | A method for localizing lesions of pneumonia in children based on improved YOLOv8 |
| CN118552535B (en) * | 2024-07-29 | 2024-11-22 | 华侨大学 | A method for localizing lesions of pneumonia in children based on improved YOLOv8 |
| CN119131556A (en) * | 2024-09-12 | 2024-12-13 | 北京建筑大学 | An electric vehicle helmet detection method and system based on improved YOLOv8 algorithm |
Legal Events

| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |