CN108346145B - Identification method of unconventional cells in pathological section - Google Patents
Identification method of unconventional cells in pathological section
- Publication number
- CN108346145B (application CN201810097641.1A)
- Authority
- CN
- China
- Prior art keywords
- unconventional
- cells
- model
- classification
- unconventional cells
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention discloses a method for identifying unconventional cells in pathological slices. An electronically scanned pathological slice is preprocessed to obtain effective discrimination regions, which are fed into a fully convolutional network for pre-training; the head of the fully convolutional network is then replaced with a fully connected layer and the network is fine-tuned, so that the fully convolutional network acquires the ability to extract features of unconventional cells and to determine their locations, allowing the effective discrimination regions to be classified more effectively. By combining the predictions of multiple ordinary classification networks through voting, a more stable classification result is output. The identification method of the present invention can automatically determine the probability that unconventional cells are present in each 20× magnified field of view of a pathological slice, outputting regions with a probability value above 0.5 as unconventional cells. This greatly reduces the workload of manually screening pathological slices for unconventional cells and screens them out quickly and accurately.
Description
Technical Field
The invention belongs to the field of medical imaging, and specifically relates to a method for identifying unconventional cells in pathological slices.
Background Art
Traditionally, unconventional cells (cells of abnormal morphology) in pathological sections are screened manually: under a microscope, a professional pathologist moves the slide and scans the entire section with the naked eye to determine whether unconventional cells are present anywhere on it. This work is tedious and time-consuming, and the error rate rises as reading time increases.
With the continuous development of science and technology, computers can now assist in the preliminary screening of unconventional cells in pathological sections.
Computer vision based on convolutional neural network (CNN) architectures such as VGGNet, ResNet and DenseNet has improved through continuous iteration, and on natural images its accuracy now exceeds that of the human eye. Semantic segmentation is an important research direction in computer vision whose task is to classify every pixel of an image by algorithm. In medical imaging, semantic segmentation is often used to segment organs, tissues or cells in an image to facilitate subsequent classification.
Jonathan Long proposed using convolution and deconvolution in fully convolutional networks (FCN) to perform semantic segmentation instead of traditional fully connected approaches, which has become one of the main methods for semantic segmentation models. U-Net is a typical fully convolutional network. Its main idea is to divide the network into a downsampling path and an upsampling path: the downsampling path uses ordinary convolution and pooling layers, while the upsampling path uses bilinear interpolation or deconvolution to enlarge the feature maps so that they match the size of the corresponding shallow feature maps. The final feature map is obtained by concatenating the shallow feature maps with the deep feature maps through skip connections.
Summary of the Invention
The purpose of the present invention is to provide a method for identifying unconventional cells in pathological slices that greatly reduces the heavy workload of manually screening such cells and screens them out quickly and accurately.
In the present invention, conventional cells are normal human cells; unconventional cells, by contrast, are human cells of abnormal morphology.
Working principle of the technical solution of the present invention:
The electronically scanned pathological slice is preprocessed into multiple regions. After conversion to the LAB color space, the mean of the A channel is used to judge whether a region is valid, yielding all effective discrimination regions of the slice. The effective discrimination regions, preprocessed with redistribution and the z-score method, are fed into a fully convolutional network for pre-training; the head of the fully convolutional network is then replaced with a fully connected layer and the network is fine-tuned. This gives the fully convolutional network the ability to extract features of unconventional cells and to determine their locations, so the effective discrimination regions are classified more effectively. By combining the predictions of multiple ordinary classification networks through voting, a more stable classification result is output, and the probability that unconventional cells are present in each 20× magnified field of view of the slice can be determined automatically.
A method for identifying unconventional cells in pathological slices, comprising:
(1) Preprocess the electronically scanned pathological slice to obtain the effective discrimination regions of the slice; within an effective discrimination region, unconventional-cell pixel areas are positive samples and conventional-cell pixel areas are negative samples;
(2) Train a fully convolutional network on the positive and negative samples obtained in step (1), adjusting the network parameters according to the degree of overlap between the model predictions and the labels, to obtain a converged slice segmentation model;
(3) On the basis of the slice segmentation model obtained in step (2), replace its segmentation head with a classifier; using discrimination regions that contain unconventional cells as positive examples and regions entirely free of unconventional cells as negative examples, fine-tune the network parameters to adapt the model to the classification task, obtaining a segmentation-pretrained classification model;
(4) From the effective discrimination regions obtained in step (1), take regions containing unconventional cells as positive examples and regions entirely free of unconventional cells as negative examples, and train k ordinary classification models by k-fold cross-validation with an ordinary convolutional neural network classification method;
where k is an integer between 5 and 10;
(5) Fuse the segmentation-pretrained classification model obtained in step (3) with the k ordinary classification models obtained in step (4) by model ensembling to construct the final classification model;
(6) For a new, unlabeled pathological slice, feed the effective discrimination regions obtained by the processing of step (1) into the final classification model, and output the regions whose probability value exceeds 0.5 as unconventional cells, i.e., the identification result.
In step (1), the preprocessing steps are:
(1-1) Divide the 20× magnified pathological slice into regions of equal size, from 512*512 to 2048*2048 pixels, and store them separately;
(1-2) Convert each block to the LAB color space, keep the blocks whose A-channel mean exceeds a threshold t as effective discrimination regions, and discard the rest;
the threshold t is between 120 and 150.
The reason the A-channel mean after conversion to LAB is used in step (1-2) as the basis for judging effective regions is that the effective regions of a pathological slice, such as tissue and cells, appear purple or red after staining. In the LAB color space, the A channel represents how red a pixel is, so the A channel is used as the criterion: a region whose A-channel mean exceeds the threshold t is considered to contain effective tissue or cells. A sketch of this preprocessing follows.
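As an illustration, steps (1-1) and (1-2) could be implemented as in the following minimal sketch using OpenCV; the helper name and the default values (tile = 2048, t = 132, taken from the embodiment below) are illustrative assumptions, not a definitive implementation.

```python
import cv2

def extract_valid_regions(slide_bgr, tile=2048, t=132):
    """Split a 20x slide image into tile*tile blocks and keep those whose
    LAB A-channel mean exceeds t (stained tissue appears red/purple).
    Edge remainders smaller than a full tile are dropped for simplicity."""
    regions = []
    h, w = slide_bgr.shape[:2]
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            block = slide_bgr[y:y + tile, x:x + tile]
            lab = cv2.cvtColor(block, cv2.COLOR_BGR2LAB)
            if lab[:, :, 1].mean() > t:   # channel 1 is A in OpenCV's LAB order
                regions.append(((x, y), block))
    return regions
```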
In step (2), methods for evaluating the overlap between the model predictions and the labels include Dice Loss, Cross Entropy and Mean Squared Error.
In step (2), the training method for the converged slice segmentation model comprises the following specific steps:
(2-1) Compress each input effective discrimination region with an image compression algorithm to a matrix of 256*256 to 512*512 pixels. This ratio preserves most image features while discarding some fine details that contribute little to the classification of positive and negative samples.
(2-2) Normalize the above matrix by redistribution and the z-score method and transform it to a standard normal distribution;
The image obtained in step (2-1) (a matrix of 256*256 to 512*512 pixels) has three RGB channels. To be learned well by a neural network, it generally needs to be redistributed and normalized. The concrete procedure is: first divide the image pixels by 255 to project them into the interval [0,1], then apply z-score normalization (subtract the mean and divide by the standard deviation) to transform the data to a standard normal distribution. The z-score is computed as
$z_i = \dfrac{x_i - \bar{x}}{s}$
where $z_i$ denotes the final output of the z-score algorithm, $x_i$ the input data, $\bar{x}$ the mean of the feature, and $s$ the standard deviation of the feature.
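A minimal NumPy sketch of this normalization, assuming per-channel statistics estimated from the training set (the helper name is illustrative):

```python
import numpy as np

def redistribute_and_zscore(image_uint8, mean, std):
    """Project pixels into [0,1], then z-score them toward a standard
    normal distribution; `mean` and `std` are training-set statistics."""
    x = image_uint8.astype(np.float32) / 255.0   # redistribution to [0,1]
    return (x - mean) / std                      # z-score: (x - mean) / s
```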
(2-3) Apply data augmentation techniques such as rotation, flipping, mirroring, brightness changes and random shifts to the standard-normal images obtained in step (2-2), so that the network can learn features at different orientations and angles while reducing overfitting in its predictions.
(2-4) Feed the matrices produced in step (2-3) into the fully convolutional network and compute the Dice Loss;
When training the segmentation model, we use a loss function designed for segmentation tasks, Dice Loss, which targets image segmentation with imbalanced positive and negative pixels. It is defined as
$D = \dfrac{2\sum_{i=1}^{N} p_i g_i}{\sum_{i=1}^{N} p_i + \sum_{i=1}^{N} g_i}$
where $i$ indexes the pixel currently being computed, $p_i$ and $g_i$ are the model's predicted score for pixel $i$ and the corresponding label score, and $N$ is the total number of pixels. $D$ measures the overlap between the binary prediction (heat map) and the label; its value lies in the interval [0,1], and the closer it is to 1, the higher the overlap. During training we use $1-D$ as the loss function.
The label is a binary matrix of the same size as the input image, in which 1 denotes unconventional-cell pixels and 0 denotes conventional-cell pixels. Since the effective discrimination regions are compressed to 512*512 pixels in step (2-1), the segmentation labels are likewise compressed to 512*512 pixels so that they match the regions.
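A minimal PyTorch sketch of the $1-D$ loss described above; the small smoothing constant `eps` is our addition for numerical stability, not part of the patent:

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    """1 - D, where D is the Dice coefficient between the predicted
    probability map `pred` and the binary label map `target`."""
    pred = pred.reshape(pred.size(0), -1)
    target = target.reshape(target.size(0), -1)
    inter = (pred * target).sum(dim=1)
    d = (2 * inter + eps) / (pred.sum(dim=1) + target.sum(dim=1) + eps)
    return (1 - d).mean()
```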
(2-5) Use the Adam algorithm as the optimization method to minimize the Dice Loss until the network converges, yielding the converged slice segmentation model.
In the fine-tuning stage, the advantage of fine-tuning the pre-trained U-Net weights with a fully connected layer is that, once U-Net training is complete, the deep layers of the U-Net model already have the ability to extract unconventional-cell features, which helps the classification task considerably; this approach also improves the interpretability of the classification model. In the prediction stage, the classification result can be output together with the segmentation result for comparison, so as to locate the regions of unconventional cells.
The training of the segmentation-pretrained classification model in steps (2) and (3) is a two-stage method: pre-train with segmentation labels first, then fine-tune with a classifier. An equivalent alternative is single-stage training, in which the output of the U-Net model serves as an auxiliary loss added to the classification loss to form the final loss to be minimized (see the sketch below); however, the single-stage method performs worse than the two-stage method.
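A minimal sketch of the single-stage alternative, reusing the `dice_loss` sketched above; the weighting factor `aux_weight` is an assumed hyperparameter, not a value given in the patent:

```python
import torch.nn.functional as F

def single_stage_loss(cls_logits, cls_target, seg_pred, seg_target, aux_weight=0.5):
    """Classification loss plus the U-Net segmentation output as an auxiliary loss."""
    cls_loss = F.cross_entropy(cls_logits, cls_target)
    aux_loss = dice_loss(seg_pred, seg_target)   # dice_loss as sketched earlier
    return cls_loss + aux_weight * aux_loss
```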
In step (3), the network fine-tuning method comprises the following steps:
(3-1) Replace the last convolutional layer of U-Net with a fully connected layer whose output is a binary classification;
(3-2) Using discrimination regions that contain unconventional cells as positive examples and regions entirely free of unconventional cells as negative examples, optimize the cross-entropy loss function with the Adam algorithm and update the network parameters until the classification model converges, yielding the segmentation-pretrained classification model. A sketch of this head replacement follows.
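As a hypothetical PyTorch sketch of steps (3-1) and (3-2), assuming a U-Net body whose deepest feature map is pooled into a vector; the module and argument names are placeholders, not the patent's implementation:

```python
import torch.nn as nn

class SegPretrainedClassifier(nn.Module):
    """Replaces the U-Net segmentation head with a binary classification head."""
    def __init__(self, unet_body, feat_channels):
        super().__init__()
        self.encoder = unet_body                 # pre-trained U-Net feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)      # global average pooling
        self.fc = nn.Linear(feat_channels, 2)    # fully connected binary head

    def forward(self, x):
        f = self.encoder(x)                      # deep features from the U-Net
        f = self.pool(f).flatten(1)
        return self.fc(f)                        # logits for cross-entropy training
```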
In step (4), the k ordinary classification models are trained by k-fold cross-validation. Concretely, the training data are first divided into k equal parts by stratified sampling; each part is used once as validation data while the remaining k-1 parts serve as training data, yielding k ordinary classification models (see the sketch below);
k is an integer between 5 and 10;
The training data comprise discrimination regions containing unconventional cells as positive examples and regions entirely free of unconventional cells as negative examples.
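A minimal sketch of this stratified k-fold loop using scikit-learn; `build_model`, `train` and `evaluate` are hypothetical helpers standing in for the classifier training routine:

```python
from sklearn.model_selection import StratifiedKFold

def train_kfold_models(regions, labels, k=5):
    """Train k classifiers, each validated on a different stratified fold."""
    models = []
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for train_idx, val_idx in skf.split(regions, labels):
        model = build_model()                               # hypothetical model factory
        train(model, regions[train_idx], labels[train_idx])
        evaluate(model, regions[val_idx], labels[val_idx])  # save at best val score
        models.append(model)
    return models
```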
However, a classifier that relies only on U-Net-assisted feature extraction still cannot classify certain global features well, so additional classifiers are needed to guarantee the accuracy of the overall classification. In the present invention we train DenseNet, which currently shows good generalization performance in classification. During training, we pre-train the model with Cross Entropy Loss and then fine-tune the parameters with Focal Loss so that the model pays more attention to difficult samples, as sketched below.
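A minimal PyTorch sketch of the focal loss used for fine-tuning; the focusing parameter `gamma = 2` is the common default from the focal loss literature, not a value specified in the patent:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma=2.0):
    """Cross entropy down-weighted for easy examples, emphasizing hard samples."""
    log_p = F.log_softmax(logits, dim=1)
    ce = F.nll_loss(log_p, target, reduction="none")
    p_t = torch.exp(-ce)                       # probability of the true class
    return ((1 - p_t) ** gamma * ce).mean()
```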
In step (5), the model ensemble fusion methods include:
(1) Voting: take the mode of the outputs of the multiple models as the final result; or,
(2) Weighted mean: assign different weights to the models, compute their weighted mean, and determine the final label from that mean; or,
(3) Stacking: train a linear classifier that takes the outputs of the multiple models as input and serves as the basis for determining the label.
The preferred model ensemble fusion method of the present invention is the weighted mean method, in which the ordinary classification models share the same weight and the segmentation-pretrained classification model is weighted k times as heavily as an ordinary classification model, k being the number of ordinary classification models.
The weighted mean method is preferred because, compared with voting, it better balances the relative importance of the segmentation-pretrained classification model and the ordinary classification models, and effectively highlights the contribution of the segmentation-pretrained model to the final model.
From the segmentation-pretrained classification model and the k DenseNet classification models obtained in the above steps, we fuse the k+1 models by a weighted mean. To highlight the contribution of the segmentation model, we use the following weighted average, where p(x) is the final predicted value and x is the input image matrix:
$p(x) = \dfrac{\lambda\,S(x) + \sum_{i=1}^{k} d_i(x)}{\lambda + k}$
where $S$ is the segmentation-pretrained classification model function, $d_i$ the $i$-th DenseNet classification function, $k$ the number of folds, and $\lambda$ the weight assigned to the segmentation-pretrained classification model.
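A minimal sketch of this weighted fusion, assuming each model maps an input image matrix to a probability; per the preferred weighting above, λ would equal k:

```python
def ensemble_predict(x, seg_model, densenet_models, lam=None):
    """Weighted mean of the segmentation-pretrained classifier and k DenseNets."""
    k = len(densenet_models)
    lam = k if lam is None else lam            # preferred weighting: lambda = k
    total = lam * seg_model(x) + sum(d(x) for d in densenet_models)
    return total / (lam + k)
```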
The method for identifying unconventional cells in pathological slices provided by the present invention combines the feature maps generated by semantic segmentation with multiple classification networks and achieves good test results: measured by the F1 score, the algorithm reaches an F1 above 96%. The F1 score is the harmonic mean of precision and recall, computed as
$F_1 = \dfrac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
where precision and recall are computed with the following formulas:
$\text{precision} = \dfrac{tp}{tp + fp}, \qquad \text{recall} = \dfrac{tp}{tp + fn}$
where $tp$, $fp$ and $fn$ denote the numbers of true positives, false positives and false negatives, respectively.
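For completeness, the metric as a small Python function:

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```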
Compared with the prior art, the present invention has the following beneficial effects:
1) The present invention greatly reduces the heavy workload of pathologists and helps popularize pathology services in grassroots and community hospitals that lack pathologist resources.
2) The present invention helps doctors screen out unconventional cells quickly and accurately; measured by the F1 score, the algorithm reaches an F1 above 96%.
Brief Description of the Drawings
Figure 1 shows the overall structure of the fully convolutional network U-Net used in the embodiment of the present invention.
Figure 2 shows the overall structure of the ordinary classification model DenseNet used in the embodiment.
Figure 3 is an overall schematic diagram of pathological slice region classification in the embodiment.
Figure 4 is a flow chart of training the pathological slice region classification model in the embodiment.
Detailed Description of the Embodiments
To aid understanding of the present invention, the method for identifying unconventional cells in pathological slices provided by the present invention is described in detail below with reference to a specific embodiment. The present invention is not limited to this embodiment, however; non-essential improvements and adjustments made by those skilled in the art under the core guiding idea of the present invention still fall within its scope of protection.
A method for identifying unconventional cells in pathological slices, with the following specific steps:
1) Pathological slice preprocessing and effective region discrimination
The input data are 20× magnified pathological slices, divided into regions of 2048*2048 pixels and stored separately.
The 2048*2048 regions are converted to the LAB color space; regions whose A-channel mean exceeds the threshold t = 132 are kept as effective discrimination regions, and the rest are discarded.
2) Training the converged slice segmentation model
(2-1) Compress the effective discrimination regions obtained in step 1) to regions of 512*512 pixels;
(2-2) First divide the image pixels by 255 to project them into the interval [0,1], then apply z-score normalization (subtract the mean and divide by the standard deviation) to transform the data into standard-normal images; the z-score is computed as
$z_i = \dfrac{x_i - \bar{x}}{s}$
where $x_i$ is the input data, $\bar{x}$ the mean of the feature, and $s$ the standard deviation of the feature;
(2-3) Apply data augmentation techniques such as rotation, flipping, mirroring, brightness changes and random shifts to the standard-normal images obtained in step (2-2), so that the network can learn features at different orientations and angles while reducing overfitting in its predictions.
(2-4) Feed the matrices from step (2-3) into the fully convolutional network U-Net, whose structure is shown in Figure 1, and compute the Dice Loss with the following formula:
$D = \dfrac{2\sum_{i=1}^{N} p_i g_i}{\sum_{i=1}^{N} p_i + \sum_{i=1}^{N} g_i}$
where $p_i$ and $g_i$ are the model's predicted score for pixel $i$ and the corresponding label score, respectively.
The label is a binary matrix of the same size as the input image, in which 1 denotes unconventional-cell pixels and 0 denotes conventional-cell pixels; since the effective discrimination regions are compressed to 512*512 pixels in step (2-1), the segmentation labels are likewise compressed to 512*512 pixels to match them.
(2-5) Use the Adam algorithm as the optimization method to minimize the Dice Loss until the network converges, yielding the converged slice segmentation model.
3) On the basis of the slice segmentation model obtained in step 2), replace the last convolutional layer of U-Net with a fully connected layer whose output is a binary classification. Using discrimination regions that contain unconventional cells as positive examples and regions entirely free of unconventional cells as negative examples, optimize the Cross Entropy Loss with the Adam algorithm to update the network parameters until the classification model converges, yielding the segmentation-pretrained classification model.
To prevent the large initial gradients of the randomly initialized fully connected layer from destroying the trained U-Net weights, fine-tuning proceeds in the following two steps (see the sketch below):
a) Freeze the U-Net weights and train only the fully connected layer until convergence;
b) Lower the learning rate of the U-Net part and train the whole network (U-Net plus the fully connected layer).
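A minimal PyTorch sketch of this two-step schedule, reusing the `SegPretrainedClassifier` sketched earlier; the learning rates and the `run_training` helper are illustrative assumptions:

```python
import torch

def finetune(model, loader, steps_a, steps_b):
    # a) freeze the U-Net body and train only the fully connected head
    for p in model.encoder.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    run_training(model, loader, opt, steps_a)   # hypothetical training loop

    # b) unfreeze and train the whole network with a lowered U-Net learning rate
    for p in model.encoder.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam([
        {"params": model.encoder.parameters(), "lr": 1e-5},  # lowered LR for U-Net
        {"params": model.fc.parameters(), "lr": 1e-4},
    ])
    run_training(model, loader, opt, steps_b)
```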
4) k-fold cross-validation DenseNet training
(4-1) Divide the training set into 5 parts: 4 parts as the training set and 1 part as the validation set;
(4-2) For each of the 5 splits, train a DenseNet model on the other 4 parts, using the held-out part as the validation set; the structure of the DenseNet model is shown in Figure 2. Save each model when its performance on its validation set is best, yielding 5 models.
5) Model fusion
From the segmentation-pretrained classification model and the 5 DenseNet classification models obtained in the above steps, the 6 models are fused by a weighted mean. To highlight the contribution of the segmentation model, the following weighted average is used:
$p(x) = \dfrac{\lambda\,S(x) + \sum_{i=1}^{k} d_i(x)}{\lambda + k}$
where $S$ is the segmentation-pretrained classification model function, $d_i$ the $i$-th DenseNet classification function, $k$ the number of folds, and $\lambda$ the weight assigned to the segmentation-pretrained classification model.
After fusion, the final classification model is obtained; the training flow of this model is shown in Figure 4.
6) Unconventional cell identification
For a new, unlabeled pathological slice, the effective discrimination regions obtained by the processing of step 1) (with threshold t = 132) are fed into the final classification model, which estimates the probability that each region contains unconventional cells; regions with a probability above 0.5 are output as the identification result.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810097641.1A CN108346145B (en) | 2018-01-31 | 2018-01-31 | Identification method of unconventional cells in pathological section |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810097641.1A CN108346145B (en) | 2018-01-31 | 2018-01-31 | Identification method of unconventional cells in pathological section |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN108346145A CN108346145A (en) | 2018-07-31 |
| CN108346145B true CN108346145B (en) | 2020-08-04 |
Family
ID=62961468
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201810097641.1A Active CN108346145B (en) | 2018-01-31 | 2018-01-31 | Identification method of unconventional cells in pathological section |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN108346145B (en) |
Families Citing this family (28)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109190682B (en) | 2018-08-13 | 2020-12-18 | 北京安德医智科技有限公司 | Method and equipment for classifying brain abnormalities based on 3D nuclear magnetic resonance image |
| CN109191476B (en) * | 2018-09-10 | 2022-03-11 | 重庆邮电大学 | Novel biomedical image automatic segmentation method based on U-net network structure |
| US10579924B1 (en) * | 2018-09-17 | 2020-03-03 | StradVision, Inc. | Learning method, learning device with multi-feeding layers and testing method, testing device using the same |
| CN109242849A (en) * | 2018-09-26 | 2019-01-18 | 上海联影智能医疗科技有限公司 | Medical image processing method, device, system and storage medium |
| JP7228031B2 (en) * | 2018-10-15 | 2023-02-22 | ベンタナ メディカル システムズ, インコーポレイテッド | Systems and methods for cell sorting |
| CN109544563B (en) * | 2018-11-12 | 2021-08-17 | 北京航空航天大学 | A passive millimeter wave image human target segmentation method for contraband security inspection |
| CN109754403A (en) * | 2018-11-29 | 2019-05-14 | 中国科学院深圳先进技术研究院 | A method and system for automatic tumor segmentation in CT images |
| CN109685077A (en) * | 2018-12-13 | 2019-04-26 | 深圳先进技术研究院 | A kind of breast lump image-recognizing method and device |
| CN109620152B (en) * | 2018-12-16 | 2021-09-14 | 北京工业大学 | MutifacolLoss-densenert-based electrocardiosignal classification method |
| CN109785334A (en) * | 2018-12-17 | 2019-05-21 | 深圳先进技术研究院 | Cardiac magnetic resonance images dividing method, device, terminal device and storage medium |
| CN109857351A (en) * | 2019-02-22 | 2019-06-07 | 北京航天泰坦科技股份有限公司 | The Method of printing of traceable invoice |
| CN110110661A (en) * | 2019-05-07 | 2019-08-09 | 西南石油大学 | A kind of rock image porosity type recognition methods based on unet segmentation |
| CN112132166B (en) * | 2019-06-24 | 2024-04-19 | 杭州迪英加科技有限公司 | Intelligent analysis method, system and device for digital cell pathology image |
| CN110634134A (en) * | 2019-09-04 | 2019-12-31 | 杭州憶盛医疗科技有限公司 | Novel artificial intelligent automatic diagnosis method for cell morphology |
| CN110853021B (en) * | 2019-11-13 | 2020-11-24 | 江苏迪赛特医疗科技有限公司 | Construction of detection classification model of pathological squamous epithelial cells |
| CN110853022B (en) * | 2019-11-14 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Pathological section image processing method, device and system and storage medium |
| CN111144488B (en) * | 2019-12-27 | 2023-04-18 | 之江实验室 | Pathological section visual field classification improving method based on adjacent joint prediction |
| CN111325103B (en) * | 2020-01-21 | 2020-11-03 | 华南师范大学 | Cell labeling system and method |
| CN111340064A (en) * | 2020-02-10 | 2020-06-26 | 中国石油大学(华东) | Hyperspectral image classification method based on high-low order information fusion |
| CN111627032A (en) * | 2020-05-14 | 2020-09-04 | 安徽慧软科技有限公司 | CT image body organ automatic segmentation method based on U-Net network |
| CN112084931B (en) * | 2020-09-04 | 2022-04-15 | 厦门大学 | A method and system for classifying leukemia cell microscopic images based on DenseNet |
| US12045992B2 (en) * | 2020-11-10 | 2024-07-23 | Nec Corporation | Multi-domain semantic segmentation with label shifts |
| CN112446876B (en) * | 2020-12-11 | 2024-11-15 | 北京大恒普信医疗技术有限公司 | Image-based anti-VEGF indication identification method, device and electronic equipment |
| CN112435259B (en) * | 2021-01-27 | 2021-04-02 | 核工业四一六医院 | Cell distribution model construction and cell counting method based on single sample learning |
| CN113034448B (en) * | 2021-03-11 | 2022-06-21 | 电子科技大学 | A multi-instance learning-based method for cell recognition in pathological images |
| CN113192047A (en) * | 2021-05-14 | 2021-07-30 | 杭州迪英加科技有限公司 | Method for automatically interpreting KI67 pathological section based on deep learning |
| CN114626521A (en) * | 2022-03-16 | 2022-06-14 | 大连交通大学 | CNN model identification accuracy optimization method based on cross-validation method |
| CN115035346B (en) * | 2022-06-23 | 2024-12-13 | 温州大学 | A classification method for Alzheimer's disease enhanced by collaborative learning method |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101168067A (en) * | 2007-09-28 | 2008-04-30 | 浙江大学 | A method and application for reducing the immunogenicity of antigenic sites on the surface of immune cells |
| CN102289500A (en) * | 2011-08-24 | 2011-12-21 | 浙江大学 | Method and system for displaying pathological section multi-granularity medical information |
| CN106097391A (en) * | 2016-06-13 | 2016-11-09 | 浙江工商大学 | A kind of multi-object tracking method identifying auxiliary based on deep neural network |
| CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
| CN107145867A (en) * | 2017-05-09 | 2017-09-08 | 电子科技大学 | Face and face occluder detection method based on multi-task deep learning |
| US9782585B2 (en) * | 2013-08-27 | 2017-10-10 | Halo Neuro, Inc. | Method and system for providing electrical stimulation to a user |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7951345B2 (en) * | 2007-06-01 | 2011-05-31 | Lary Research & Development, Llc | Useful specimen transport apparatus with integral capability to allow three dimensional x-ray images |
- 2018-01-31 CN CN201810097641.1A patent/CN108346145B/en active Active
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101168067A (en) * | 2007-09-28 | 2008-04-30 | 浙江大学 | A method and application for reducing the immunogenicity of antigenic sites on the surface of immune cells |
| CN102289500A (en) * | 2011-08-24 | 2011-12-21 | 浙江大学 | Method and system for displaying pathological section multi-granularity medical information |
| US9782585B2 (en) * | 2013-08-27 | 2017-10-10 | Halo Neuro, Inc. | Method and system for providing electrical stimulation to a user |
| CN106097391A (en) * | 2016-06-13 | 2016-11-09 | 浙江工商大学 | A kind of multi-object tracking method identifying auxiliary based on deep neural network |
| CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
| CN107145867A (en) * | 2017-05-09 | 2017-09-08 | 电子科技大学 | Face and face occluder detection method based on multi-task deep learning |
Non-Patent Citations (2)
| Title |
|---|
| Convolutions of Pathological Submeasures; Ilijas Farah; Measure Theory; 2005-01-17; pp. 1-4 * |
| Gland segmentation in colon pathology images based on convolutional neural networks; Lü Lijing; China Master's Theses Full-text Database, Information Science and Technology; 2017-03-15 (No. 3); pp. I138-5141 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN108346145A (en) | 2018-07-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN108346145B (en) | Identification method of unconventional cells in pathological section | |
| CN109300121B (en) | A kind of construction method of cardiovascular disease diagnosis model, system and the diagnostic device | |
| CN111191660B (en) | A multi-channel collaborative capsule network-based method for classifying pathological images of colon cancer | |
| CN111695469B (en) | Hyperspectral image classification method of light-weight depth separable convolution feature fusion network | |
| CN108447062B (en) | A segmentation method for unconventional cells in pathological slices based on a multi-scale hybrid segmentation model | |
| CN110399929B (en) | Fundus image classification method, fundus image classification apparatus, and computer-readable storage medium | |
| CN106096538B (en) | Face identification method and device based on sequencing neural network model | |
| CN110969191B (en) | Glaucoma prevalence probability prediction method based on similarity maintenance metric learning method | |
| CN108446729A (en) | Egg embryo classification method based on convolutional neural networks | |
| CN111144496A (en) | Garbage classification method based on hybrid convolutional neural network | |
| CN112101451A (en) | Breast cancer histopathology type classification method based on generation of confrontation network screening image blocks | |
| CN106934418B (en) | Insulator infrared diagnosis method based on convolution recursive network | |
| CN108734208A (en) | Multi-source heterogeneous data fusion system based on multi-modal depth migration study mechanism | |
| CN111898432A (en) | A pedestrian detection system and method based on improved YOLOv3 algorithm | |
| CN110728179A (en) | Pig face identification method adopting multi-path convolutional neural network | |
| US20240054760A1 (en) | Image detection method and apparatus | |
| CN114972208A (en) | YOLOv 4-based lightweight wheat scab detection method | |
| Feng et al. | Iris R-CNN: Accurate iris segmentation and localization in non-cooperative environment with visible illumination | |
| CN112348059A (en) | Deep learning-based method and system for classifying multiple dyeing pathological images | |
| CN113159159A (en) | Small sample image classification method based on improved CNN | |
| CN114140437A (en) | Fundus hard exudate segmentation method based on deep learning | |
| CN110390312A (en) | Chromosome automatic classification method and classifier based on convolutional neural network | |
| CN116503932B (en) | Method, system and storage medium for extracting eye periphery characteristics of weighted key areas | |
| CN108765374A (en) | A kind of method of abnormal core region screening in cervical smear image | |
| CN118430790A (en) | Mammary tumor BI-RADS grading method based on multi-modal-diagram neural network |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant | ||
| EE01 | Entry into force of recordation of patent licensing contract | ||
| EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 20180731 Assignee: WEIYIYUN (HANGZHOU) HOLDING Co.,Ltd. Assignor: ZHEJIANG University Contract record no.: X2025980003967 Denomination of invention: A method for identifying unconventional cells in pathological sections Granted publication date: 20200804 License type: Exclusive License Record date: 20250220 |