Detailed Description
The following is a clear and complete description of the technical method of the present invention, taken in conjunction with the accompanying drawings, and it is evident that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, are intended to fall within the scope of the present invention.
Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. The functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
It will be understood that, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
In order to achieve the above objective, referring to FIGS. 1 to 3, the present invention provides an intelligent analysis system for data before breast tumor resection, comprising the following modules:
The breast tumor case data preprocessing module is used for acquiring breast tumor case data of the medical database; performing data preprocessing on the breast tumor case data to obtain standard breast tumor case data, wherein the standard breast tumor case data comprises breast tumor pathology data and breast tumor image data;
The breast tumor pathological feature processing module is used for performing breast tumor pathological type analysis based on the breast tumor pathology data to generate breast tumor pathological type data; calculating a breast tumor nuclear division index based on the breast tumor pathology data to generate the breast tumor nuclear division index;
The breast tumor target area image extraction module is used for performing target area image segmentation processing on the breast tumor image data to generate breast tumor target area image data;
The breast tumor growth trend analysis module is used for carrying out breast tumor metastasis mode analysis processing based on the breast tumor target area image data to generate breast tumor metastasis mode data; performing breast tumor growth trend analysis according to the breast tumor nuclear division index and the breast tumor metastasis mode data to generate breast tumor growth trend data;
The prognostic breast tumor cutting risk prediction module is used for constructing a breast tumor cutting risk prediction model; performing prognostic breast tumor target area image analysis on the breast tumor target area image data according to the breast tumor growth trend data to generate prognostic breast tumor target area image data; transmitting the prognostic breast tumor target area image data to the breast tumor cutting risk prediction model to predict the prognostic breast tumor cutting risk, and generating prognostic breast tumor cutting risk prediction data;
The prognostic breast tumor excision information intelligent feedback module is used for performing prognostic breast tumor treatment characteristic analysis according to the breast tumor pathological type data and the prognostic breast tumor target area image data to obtain prognostic breast tumor treatment characteristic data; designing a prognostic breast tumor treatment auxiliary decision according to the prognostic breast tumor treatment characteristic data; and transmitting the prognostic breast tumor cutting risk prediction data and the prognostic breast tumor treatment auxiliary decision to a terminal to execute the intelligent feedback operation of the prognostic breast tumor auxiliary excision information.
By acquiring breast tumor case data from the medical database, the invention can collect case data from a plurality of medical databases, including clinical records, image data and pathological data, which ensures that the data sources are broad and comprehensive. Data preprocessing cleans and standardizes the data from these different sources, including removing duplicate data and correcting erroneous data, which ensures the consistency and accuracy of the data; the resulting standard breast tumor case data integrates pathological data and image data and provides a complete and accurate basis for subsequent analysis and diagnosis. By analyzing the pathological data, the system can identify different types of breast tumors and their corresponding degrees of deterioration and classify them, providing an important basis for the diagnosis and treatment decisions of doctors. The nuclear division index is an important index for evaluating the proliferation activity of tumor cells; calculating it helps determine the growth rate and potential invasiveness of the tumor, which in turn helps predict the development trend and formulate treatment strategies. Target area image segmentation extracts the target area of the breast tumor from the whole image and locates and identifies the position, shape and size of the tumor; the extracted target area image data not only supports visual analysis at the diagnosis stage but also serves as the basic data for the subsequent growth trend analysis and treatment planning.
Analyzing the metastasis pattern of the breast tumor, that is, how the tumor spreads and metastasizes in the breast, based on the breast tumor target area image data is crucial for predicting the development and potential risk of the tumor. Combining the nuclear division index with the metastasis pattern data allows the growth rate, potential deterioration trend and treatment response of the breast tumor to be evaluated, which helps in formulating a targeted and effective treatment scheme. The risk of the breast tumor resection operation is estimated by analyzing the breast tumor growth trend data and the target area image data, and the potential risks and complications of the operation are predicted using big data and machine learning techniques in combination with clinical experience and scientific evidence. Prognostic breast tumor target area image analysis is performed on the breast tumor target area image data according to the breast tumor growth trend data, further analyzing later-stage trend characteristics of the tumor such as its shape, size and position. The prognostic breast tumor target area image data is then transmitted to the breast tumor cutting risk prediction model to predict the prognostic breast tumor cutting risk, so that the surgical difficulty and the patient's individual risk are better evaluated, the surgical plan is optimized, and the safety and success rate of the surgery are improved.
Prognostic breast tumor treatment characteristic analysis is performed according to the breast tumor pathological type data and the prognostic breast tumor target area image data, including predicting the biological characteristics of the tumor, the individual differences and treatment responses of patients, and the breast tumor cutting path, so that a personalized treatment scheme can be made and treatment effectiveness and prognosis improved. An intelligent treatment auxiliary decision scheme is designed and provided to give auxiliary medical advice and support before breast tumor cutting, optimizing the treatment flow and improving treatment accuracy and patient satisfaction. Transmitting the prognostic breast tumor cutting risk prediction data and the treatment auxiliary decision to the terminal helps medical staff adjust treatment strategies and ensures that the patient is optimally supported and managed throughout the course of treatment.
In the embodiment of the present invention, referring to fig. 1, which is a schematic block flow diagram of the intelligent analysis system for data before breast tumor resection according to the present invention, the system in this embodiment includes:
S1: the breast tumor case data preprocessing module is used for acquiring breast tumor case data of the medical database; performing data preprocessing on the breast tumor case data to obtain standard breast tumor case data, wherein the standard breast tumor case data comprises breast tumor pathology data and breast tumor image data;
In the embodiment of the invention, access authority to a medical database containing breast tumor case data is obtained, and the breast tumor case data is extracted from the medical database using Structured Query Language (SQL) or a dedicated API interface, wherein the breast tumor case data comprises basic information of the patient (such as age and sex), clinical examination data (such as tumor size and position), pathological data (such as tissue type and tumor grading), imaging data (such as MRI and ultrasound images) and the like. The extracted data is cleaned and integrated, including handling missing values, outliers and inconsistent data, to ensure data quality and consistency, and data from different data sources is standardized into a unified format and unit. Relevant features, such as key features in the pathological data and the imaging data, are selected and extracted according to the characteristics of the breast tumor case data. Non-numerical data is encoded or converted, for example by one-hot encoding or label encoding of categorical data, to facilitate processing by machine learning algorithms, and numerical data is normalized or standardized so that the features have similar scales. Standard breast tumor case data is thereby obtained, comprising the processed breast tumor pathology data and breast tumor image data.
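The encoding and normalization steps described above can be illustrated with a minimal, dependency-free sketch. The record fields (`tissue_type`, `tumor_size_mm`) and their values are hypothetical and chosen only for illustration; a practical implementation would more likely rely on an established data-processing library.

```python
from typing import List


def one_hot(values: List[str]) -> List[List[int]]:
    """One-hot encode category labels; columns follow sorted label order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical case records (assumed field names, not taken from the source).
records = [
    {"tissue_type": "ductal", "tumor_size_mm": 12.0},
    {"tissue_type": "lobular", "tumor_size_mm": 30.0},
    {"tissue_type": "ductal", "tumor_size_mm": 21.0},
]

encoded_types = one_hot([r["tissue_type"] for r in records])
scaled_sizes = min_max_scale([r["tumor_size_mm"] for r in records])
```

Here `encoded_types` holds one column per tissue type and `scaled_sizes` places every tumor size on a common [0, 1] scale, so that categorical and numerical features can be fed to the same learning algorithm.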
S2: the breast tumor pathological feature processing module is used for performing breast tumor pathological type analysis based on the breast tumor pathology data to generate breast tumor pathological type data; calculating a breast tumor nuclear division index based on the breast tumor pathology data to generate the breast tumor nuclear division index;
In the embodiment of the invention, the histological features of the breast tumor tissue extracted from the breast tumor pathology data undergo preliminary processing, including data cleaning and feature extraction, and the most representative features are selected for analysis from the different features in the pathology data (such as cell morphology and tissue structure). Association analysis is performed on the breast tumor pathological features, and the main associated features are analyzed by principal component analysis to find the similarities and differences between cases. The pathological type of the breast tumor is analyzed according to the association relationships and the principal component analysis result, generating breast tumor pathological type data, such as classification labels or descriptive information for the different types of breast tumor. According to the nuclear morphological features in the pathology data, such as nuclear size, shape, chromatin distribution and chromatin change data, a deep learning algorithm is designed to automatically calculate the nuclear division index of the breast tumor cells from the number and distribution of the measured cell nuclei, so as to evaluate the proliferation activity and clinical prognosis of the tumor.
S3: the breast tumor target area image extraction module is used for carrying out target area image segmentation processing on the breast tumor image data to generate breast tumor target area image data;
In the embodiment of the invention, target region image segmentation processing is carried out on breast tumor image data, such as selecting an algorithm suitable for breast tumor image segmentation, such as a segmentation algorithm based on deep learning (such as U-Net and Mask R-CNN), a region growing method or a segmentation algorithm based on image texture characteristics, preliminary region marking or seed point initialization is carried out on the breast tumor image data, such as selecting an image region of a target gray value gradient region through gray value gradient characteristic data of the breast tumor image data as input of the segmentation algorithm, the selected algorithm is applied to accurately segment the breast tumor target region, the segmentation result is ensured to accurately reflect the boundary and region of the tumor, post-processing is carried out on the segmentation result, including removing unreasonable regions, filling holes, smoothing the boundary and the like, so as to improve the quality and continuity of the segmentation result, and breast tumor target region image data is generated.
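Of the segmentation options named above, the region growing method is the simplest to illustrate. The following is a minimal sketch, assuming a grayscale image stored as a list of rows, a single manually chosen seed point, 4-connectivity, and a fixed intensity tolerance; all of these are illustrative assumptions rather than parameters given in this description.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def region_grow(image: List[List[int]], seed: Pixel, tolerance: int) -> Set[Pixel]:
    """Grow a 4-connected region from `seed`, accepting pixels whose
    intensity differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Toy 4x4 image: a bright 2x2 block (the assumed target) on a dark background.
image = [
    [10, 10, 10, 10],
    [10, 200, 210, 10],
    [10, 205, 198, 10],
    [10, 10, 10, 10],
]
region = region_grow(image, seed=(1, 1), tolerance=20)
```

The returned set of pixel coordinates plays the role of the segmented target area; post-processing such as hole filling and boundary smoothing would be applied to it afterwards.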
S4: the breast tumor growth trend analysis module is used for carrying out breast tumor metastasis mode analysis processing based on the breast tumor target area image data to generate breast tumor metastasis mode data; performing breast tumor growth trend analysis according to the breast tumor nuclear division index and the breast tumor metastasis mode data to generate breast tumor growth trend data;
In the embodiment of the invention, related biological characteristics such as blood vessel density, blood vessel tissue structure and the like are extracted from the image data of a target area of the breast tumor, a preset clustering algorithm (such as K-means, hierarchical clustering and the like) is used for carrying out clustering analysis on the breast tumor metastasis modes based on the extracted characteristic data so as to reveal the similarity and the difference between different metastasis modes, breast tumor nuclear division index data and metastasis mode data are associated, an integral framework for breast tumor growth trend analysis is established, the time sequence neural network algorithm is used for analyzing the time sequence characteristic data of the breast tumor nuclear division index and the breast tumor metastasis mode data, the time sequence characteristic data is used for carrying out preliminary analysis and description on the growth trend of the breast tumor, the information including growth rate, metastasis path and the like, the relationship between the breast tumor metastasis mode and the survival condition of a patient is further analyzed, the potential influence factors of the breast tumor metastasis mode on the breast tumor prognosis are explored, and the growth trend of the breast tumor is refined, optimized and regulated based on the potential influence factors of the breast tumor prognosis, so that the breast tumor growth trend data is obtained.
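The clustering step mentioned above can be sketched with a plain K-means loop. This is a minimal illustration only: the two-dimensional feature vectors (standing in for, say, normalized blood vessel density and a tissue structure score), the initial centroids and the iteration count are all hypothetical values, not data from this description.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           iterations: int = 10) -> Tuple[List[int], List[Point]]:
    """Plain K-means: alternate nearest-centroid assignment and mean update."""
    centers = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers


# Hypothetical normalized features extracted from four target-area images.
features = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90), (0.85, 0.80)]
labels, centers = kmeans(features, centroids=[(0.0, 0.0), (1.0, 1.0)])
```

Each resulting label can be read as one candidate metastasis pattern group; which clinical pattern a group corresponds to would still have to be interpreted separately.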
S5: the prognostic breast tumor cutting risk prediction module is used for constructing a breast tumor cutting risk prediction model; performing prognostic breast tumor target area image analysis on the breast tumor target area image data according to the breast tumor growth trend data to generate prognostic breast tumor target area image data; transmitting the prognostic breast tumor target area image data to the breast tumor cutting risk prediction model to predict the prognostic breast tumor cutting risk, and generating prognostic breast tumor cutting risk prediction data;
In the embodiment of the invention, historical breast tumor cutting data, including clinical data related to historical breast tumor cutting, is acquired, and a suitable machine learning or deep learning model, such as a support vector machine (SVM), a random forest or a convolutional neural network (CNN), is selected to establish a mathematical model of the relationship between the breast tumor cutting area and the breast tumor cutting risk, so as to predict the breast tumor cutting risk. The model is trained on the historical breast tumor cutting data, with the breast tumor cutting area data as the training input and the cutting risk label data as the training output; the hyperparameters of the model are tuned by means such as cross-validation or grid search, and the breast tumor cutting risk prediction model is finally constructed. The processed and analyzed growth trend data is obtained from the breast tumor growth trend analysis module and includes detailed information on the growth rate, metastasis pattern, survival relationship and the like of the breast tumor. The image data extracted by the breast tumor target area image extraction module is then used, ensuring that it contains accurate breast tumor target area images. The growth trend data is then combined with the target area image data, and image evolution numerical analysis is performed on the breast tumor target area image using this detailed information together with image processing techniques and image analysis algorithms, generating prognostic breast tumor target area image data. The prognostic breast tumor target area image data is transmitted to the breast tumor cutting risk prediction model to predict the prognostic breast tumor cutting risk; the model analyzes, from the prognostic breast tumor target area image data, the risk information associated with the prognosis of breast tumor cutting, such as cutting risk probability and abnormal cutting paths, thereby generating prognostic breast tumor cutting risk prediction data.
S6: the prognostic breast tumor excision information intelligent feedback module is used for performing prognostic breast tumor treatment characteristic analysis according to the breast tumor pathological type data and the prognostic breast tumor target area image data to obtain prognostic breast tumor treatment characteristic data; designing a prognostic breast tumor treatment auxiliary decision according to the prognostic breast tumor treatment characteristic data; and transmitting the prognostic breast tumor cutting risk prediction data and the prognostic breast tumor treatment auxiliary decision to a terminal to execute the intelligent feedback operation of the prognostic breast tumor auxiliary excision information.
In the embodiment of the invention, the characteristic analysis of the pathological types of the breast tumor is carried out according to the pathological type data of the breast tumor so as to obtain the characteristic data of the pathological types of the breast tumor, wherein the characteristic data comprise the characteristic information of histological types, grading, immunohistochemical molecular subtypes and the like of the tumor, the prognosis treatment data analysis of the pathological types of the breast tumor is carried out according to the characteristic data of the pathological types of the breast tumor, the breast tumor is classified into different types such as duct cancer, lobular cancer, breast cancer and the like, the different types of the breast tumor are graded (such as low differentiation degree, medium differentiation degree and high differentiation degree) and the immunohistochemical subtypes (such as hormone receptor and HER2 expression condition) are determined, and the prognosis treatment data of the pathological types of the breast tumor are analyzed according to the information and medical files. The image data of the target area of the prognosis breast tumor relates to a corresponding breast tumor excision mode, such as breast conservation operation or mastectomy, so as to design the prognosis treatment data of the image data of the target area of the prognosis breast tumor. And analyzing the associated characteristics of the breast tumor according to the pathological type of the breast tumor and the prognosis treatment data corresponding to the image of the target area of the prognosis breast tumor, namely, the operation mode of the breast tumor under the prognosis condition and the corresponding later treatment data so as to obtain the prognosis breast tumor treatment characteristic data. 
And designing a prognosis breast tumor treatment auxiliary decision according to the prognosis breast tumor treatment characteristic data, namely integrating the treatment scheme types of all the prognosis breast tumor treatment characteristic data to design the prognosis breast tumor treatment auxiliary decision. And taking the operation modes corresponding to the prognosis breast tumor treatment auxiliary decision and the prognosis breast tumor cutting risk prediction data as association labels, and pushing the association labels to the terminal to execute the intelligent feedback operation of the prognosis breast tumor auxiliary excision information, so that medical staff receives the operation modes, operation risks, prognosis conditions, later treatment and the like of the breast tumor through the terminal.
Preferably, the breast tumor case data preprocessing module comprises the following functions:
Obtaining breast tumor case data of a medical database;
Performing breast tumor case information type analysis on the breast tumor case data to obtain breast tumor case information type data;
Designing a classification case information preprocessing decision according to the breast tumor case information type data;
Carrying out data preprocessing on the breast tumor case data according to the classified case information preprocessing decision so as to obtain standard breast tumor case data.
The invention acquires the breast tumor case data of the medical database, the system ensures that the source of the data is wide and comprehensive, integrates the breast tumor case data of patients in different areas, different hospitals and different time periods, reduces deviation and misleading results, and provides a credible basis for subsequent analysis. The system can accurately identify and classify different types of data, classify and sort the different types of data, lay a foundation for subsequent classification and analysis work, and improve the availability and efficiency of the data. According to the analyzed information type data, various preprocessing strategies and flows are formulated, different processing methods are implemented for different types of data, so that the quality and accuracy of the data are ensured, proper preprocessing decisions are designed to effectively clean and normalize the data, noise and errors in the data are eliminated, and the subsequent analysis is more accurate and reliable. According to the data preprocessing decision of the classified case information, the data preprocessing is carried out on the breast tumor case data, and the standardized processing is carried out on the breast tumor case data, including data cleaning, missing value processing, data conversion and the like, so that the consistency and the accuracy of the data are ensured, the system can reduce the time and the cost of the data processing through the optimized data preprocessing flow, the data processing efficiency is improved, and the standard breast tumor case data can be more rapidly applied to clinical and scientific research practice.
In the embodiment of the invention, the access authority of a medical database containing breast tumor case data is obtained, and the breast tumor case data is extracted from the medical database by using a Structured Query Language (SQL) or a special API interface, wherein the breast tumor case data comprises basic information (such as age and sex) of a patient, clinical examination data (such as tumor size and position), pathological data (such as tissue type and tumor grading), imaging data (such as MRI, ultrasound and other images) and the like. And according to the data source of the API interface, performing breast tumor case information type analysis on the breast tumor case data to obtain the breast tumor case information type data, wherein the breast tumor case information type data comprises image type data, pathology type data and the like of the breast tumor case data. According to the breast tumor case information type data, designing corresponding classified case information preprocessing decisions, such as standardized adjustment of image gray level, contrast and the like of image data, identifying and removing image data with excessive noise, performing text mining on pathology type data, analyzing keywords, phrases or text features, identifying and removing text anomalies. And carrying out data preprocessing on the breast tumor case data according to the classified case information preprocessing decisions, and carrying out unified data preprocessing according to the case information preprocessing decisions of each type in the classified case information preprocessing decisions, such as preprocessing decisions of image data, pathological data and the like, so as to obtain standard breast tumor case data, wherein the standard breast tumor case data comprises preprocessed breast tumor pathological data and breast tumor image data.
Preferably, the breast tumor pathology feature processing module comprises the following functions:
Extracting pathological biomarker characteristic data from the breast tumor pathological data to generate pathological biomarker characteristic data;
Performing pathological type analysis on the breast tumor according to the pathological biomarker characteristic data to generate breast tumor pathological type data;
Analyzing the breast tumor nuclear division index according to the pathological biomarker characteristic data to generate the breast tumor nuclear division index.
According to the invention, the pathological biomarker characteristic data is extracted according to the pathological data of the breast tumor, and a plurality of important biomarker characteristics such as cell structures, nuclear morphology, cell distribution and the like are extracted by analyzing the pathological data of the breast tumor, so that detailed information about the biological characteristics and the pathological characteristics of the tumor is provided, and the generated pathological biomarker characteristic data not only can be used for diagnosing and classifying the breast tumor, but also can be used as input of a follow-up analysis and prediction model to help evaluate the development of the tumor and the prognosis of a patient. The system can accurately classify the pathological types of the breast tumor, such as benign tumor, malignant tumor and the like, is critical to correct pathological diagnosis and treatment planning, and different types of breast tumor need different treatment strategies, and can provide personalized treatment suggestions for doctors through accurately classifying the tumor types, thereby improving the treatment effect and the survival rate of patients. The analysis of the breast tumor nuclear division index is carried out according to the characteristic data of the pathological biomarkers, the nuclear division index is an important index for evaluating the proliferation rate of tumor cells, the nuclear division index of the breast tumor is calculated by analyzing the characteristic data of the biomarkers, and the important significance is achieved in evaluating the growth rate of the tumor, predicting the invasiveness of the tumor and selecting a proper treatment scheme.
In embodiments of the present invention, natural language processing (NLP) techniques are used to analyze the pathology report text: a named entity recognition (NER) algorithm is applied to identify key biomarkers, such as ER, PR, HER2 and Ki-67, and regular expressions or pattern matching are used to extract the specific values or states of these markers. For example: ER status: positive/negative, and positive rate (e.g., 70%); PR status: positive/negative, and positive rate; HER2 status: 0/1+/2+/3+ or negative/positive; Ki-67 proliferation index: percentage value. Other relevant pathological features, such as tumor size, grade and lymph node metastasis status, are also extracted to generate the pathological biomarker characteristic data. Based on the extracted biomarker characteristics, a rule-based classification system is established, in which the judgment rules for the different breast tumor pathological types are defined according to the classification standards of the WHO or other authorities. For example: if ER and/or PR are positive and HER2 is negative, the tumor is defined as "Luminal A type"; if ER and/or PR are positive and HER2 is positive, it is defined as "Luminal B type"; if ER and PR are negative and HER2 is positive, it is defined as "HER2 over-expression"; if ER, PR and HER2 are all negative, it is defined as "triple negative". The breast tumor pathological type data is obtained from the pathological type determined for each breast tumor. Keywords and phrases related to nuclear division, such as "mitotic count" and "mitoses per 10 HPF", are searched for in the pathology report text, and the specific values of the nuclear division count are extracted using regular expressions. If the nuclear division index or level is given directly in the report, that information is extracted directly; if only the raw count data is present, the nuclear division index is calculated according to the standard. For example: level 1: 0-9 mitoses/10 HPF; level 2: 10-19 mitoses/10 HPF; level 3: 20 or more mitoses/10 HPF. If the nuclear division count is missing, the Ki-67 proliferation index is tried as a substitute index, since the two are correlated to some degree. The breast tumor nuclear division index is thereby generated.
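The extraction and rule-based classification described above can be sketched as follows. This is a minimal illustration that follows only the simplified rules stated in this embodiment; the report wording, the regular expressions, the function names and the choice to treat a missing status as negative are all assumptions made for the example.

```python
import re
from typing import Optional


def parse_status(report: str, marker: str) -> Optional[bool]:
    """Return True/False for a 'positive'/'negative' status, None if absent."""
    pattern = rf"\b{re.escape(marker)}\b\s*[:=]?\s*(positive|negative)"
    match = re.search(pattern, report, re.IGNORECASE)
    return None if match is None else match.group(1).lower() == "positive"


def parse_mitotic_count(report: str) -> Optional[int]:
    """Extract N from phrases such as 'N mitoses per 10 HPF' or 'N mitoses/10 HPF'."""
    match = re.search(r"(\d+)\s*mitoses\s*(?:per|/)\s*10\s*HPF", report, re.IGNORECASE)
    return None if match is None else int(match.group(1))


def molecular_subtype(er: bool, pr: bool, her2: bool) -> str:
    """Apply the four rules of this embodiment (ER/PR/HER2 only)."""
    hormone_positive = er or pr
    if hormone_positive and not her2:
        return "Luminal A"
    if hormone_positive and her2:
        return "Luminal B"
    if her2:
        return "HER2 over-expression"
    return "triple negative"


def nuclear_division_level(count: int) -> int:
    """Level 1: 0-9, level 2: 10-19, level 3: 20 or more mitoses per 10 HPF."""
    if count <= 9:
        return 1
    return 2 if count <= 19 else 3


# Hypothetical report text, written only to exercise the patterns above.
report = ("ER: positive (70%); PR: negative; HER2: negative; Ki-67: 15%. "
          "Mitotic count: 12 mitoses per 10 HPF.")
er, pr, her2 = (parse_status(report, m) for m in ("ER", "PR", "HER2"))
# Assumption: a status that could not be parsed (None) is treated as negative.
subtype = molecular_subtype(bool(er), bool(pr), bool(her2))
count = parse_mitotic_count(report)
level = nuclear_division_level(count) if count is not None else None
```

For this sample report the sketch yields the Luminal A subtype and nuclear division level 2. Real pathology reports vary widely in wording, so the patterns would need to be extended or replaced by the NER step before practical use.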
Preferably, the breast tumor target area image extraction module comprises the following functions:
Performing gray value gradient feature analysis on the breast tumor image data to generate gray value gradient feature data;
Performing corner target detection processing on the breast tumor image data according to the gray value gradient characteristic data to generate breast tumor target detection data;
And performing target area image segmentation processing on the breast tumor image data according to the breast tumor target detection data to generate breast tumor target area image data.
According to the invention, gray value gradient feature analysis is performed on the breast tumor image data to extract the gray-level changes and gradient information in the image, which helps distinguish tumor areas from normal tissue and improves the accuracy and precision of the target area; the gray value gradient characteristic data reflects fine changes in the image, captures the details and edge information of the tumor area, and provides more information for subsequent target detection and segmentation. Corner target detection is performed on the breast tumor image data according to the gray value gradient characteristic data, which effectively locates the position and boundary of the breast tumor in the image, marks the edges and key regions of the tumor, and provides an accurate starting point for subsequent segmentation and analysis; corner target detection identifies potential tumor regions automatically, reducing manual intervention and operational complexity and improving processing efficiency and consistency. The breast tumor target area image data can be applied to clinical diagnosis and treatment planning, helping doctors better understand the size, shape and position of the tumor and supporting the formulation of personalized treatment schemes.
In the embodiment of the invention, gray value gradient feature analysis is performed on the breast tumor image data as follows. The input color breast image is converted into a grayscale image using the cvtColor function of the OpenCV library, and the GaussianBlur function is then applied for Gaussian filtering (kernel size set to 5x5, sigma set to 1.5) to reduce noise. The gradients in the x and y directions are calculated separately using the Sobel function (ksize set to 3), and the gradient magnitude and gradient direction are calculated using the sqrt and arctan2 functions of the numpy library. A gradient magnitude histogram of the whole image is then calculated (using the histogram function of numpy, bins set to 256). The image is divided into a 16x16 grid, and HOG features are calculated for each grid cell using the hog function of the skimage library (orientations set to 9, pixels_per_cell set to (8, 8), cells_per_block set to (2, 2)). The LBP features of each grid cell are also calculated using the local_binary_pattern function of skimage (P set to 8, R set to 1). The global gradient histogram, the HOG features and the LBP features are joined into one feature vector using the concatenate function of numpy. The PCA class of the sklearn library is then used for dimension reduction (n_components set to 0.95, that is, 95% of the variance is retained), and the MinMaxScaler class of sklearn is used to normalize the reduced features to the range [0, 1], yielding the final gray value gradient feature data.
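The gradient step described above (Sobel derivatives, then magnitude, direction and a magnitude histogram) can be illustrated with a dependency-free sketch. It uses plain Python lists in place of the OpenCV and numpy calls named in the embodiment, leaves border pixels at zero, and the function names are hypothetical.

```python
import math

# 3x3 Sobel kernels for the x and y derivatives
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradients(gray):
    """Return (magnitude, direction) grids; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            mag[y][x] = math.hypot(gx, gy)   # gradient magnitude
            ang[y][x] = math.atan2(gy, gx)   # gradient direction in radians
    return mag, ang


def magnitude_histogram(mag, bins=256):
    """Histogram of gradient magnitudes scaled to the image maximum."""
    flat = [v for row in mag for v in row]
    top = max(flat) or 1.0                   # avoid dividing by zero on flat images
    hist = [0] * bins
    for v in flat:
        hist[min(int(v / top * bins), bins - 1)] += 1
    return hist
```

On an image with a single vertical edge, the magnitude is non-zero only along that edge and the direction there is 0 radians, which is the behaviour the later corner detection relies on.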
Corner target detection processing is performed on the breast tumor image data according to the gray value gradient feature data as follows. Harris corner detection is performed using the cornerHarris function of OpenCV (blockSize set to 2, ksize set to 3, k set to 0.04), and salient corners are then selected by thresholding (the threshold is 1% of the maximum value of the result returned by cornerHarris). The corner locations are refined to sub-pixel accuracy using the cornerSubPix function (winSize set to (5, 5), zeroZone set to (-1, -1), criteria set to (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 40, 0.001)). A BRIEF descriptor (bytes set to 32) is then calculated for the 11x11 area around each corner using a BRIEF descriptor class. Feature matching is performed using the FlannBasedMatcher class (index_params set to dict(algorithm = FLANN_INDEX_LSH, table_number = 6, key_size = 12, multi_probe_level = 1), search_params set to dict(checks = 50)). The matched corner points are clustered using the DBSCAN class of sklearn (eps set to 20, min_samples set to 5). For each cluster, the minimum bounding rectangle is calculated using the cv2.minAreaRect function and taken as a candidate target region. Using the previously generated gray value gradient feature data, the correlation of each candidate region with a pre-trained tumor model is calculated with the correlate2d function of scipy and used as its score. Overlapping regions are removed using a non-maximum suppression algorithm (the implementation may refer to the tf.image.non_max_suppression function of tensorflow, with iou_threshold set to 0.5), and the information of the retained candidate regions (including position, size, score and the like) is taken as the breast tumor target detection data.
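The non-maximum suppression step with an IoU threshold of 0.5 can be sketched as below. The sketch assumes axis-aligned boxes given as (x1, y1, x2, y2), whereas cv2.minAreaRect may return rotated rectangles, and it uses a plain greedy loop rather than the tensorflow routine mentioned above.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept, visiting boxes from highest to lowest score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Two heavily overlapping candidates collapse to the higher-scoring one, while a distant candidate is retained.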
Target area image segmentation processing is performed on the breast tumor image data according to the breast tumor target detection data as follows. An initial segmentation mask is created from the target detection data using the zeros function of numpy. Fine segmentation is then performed with an active contour algorithm implemented by a cv2.Snake function (alpha set to 0.1, beta set to 0.1, gamma set to 0.1). A region growing algorithm is implemented using the detected target center as the seed point (threshold set to 20, that is, a new pixel is included in the region when its gray-level difference from the seed point is smaller than 20). Boundary optimization is performed with a watershed algorithm using the cv2.distanceTransform and cv2.watershed functions. The segmentation results are subjected to opening and closing operations using the cv2.morphologyEx function (kernel size set to 5x5, iterations set to 2) to smooth the boundaries. The results of the different segmentation methods are combined by majority voting (the final class of each pixel is determined by the majority of the results of the three methods: active contour, region growing and watershed). The pydensecrf library is used to implement a conditional random field that optimizes the spatial consistency of the segmentation boundaries (pos_w set to 3, pos_xy_std set to 1, bi_w set to 4, bi_xy_std set to 67, bi_rgb_std set to 3). Morphological features (such as area, perimeter and roundness) and texture features (GLCM features, including contrast, correlation and energy, calculated using the greycoprops function of skimage) are computed for the segmented region. Finally, an HDF5 file is created using the h5py library, and the original image, segmentation mask, boundary coordinates and extracted features are stored in it to obtain the final breast tumor target area image data.
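Two of the steps above lend themselves to a compact illustration: region growing from a seed with a gray-level threshold of 20, and per-pixel majority voting over three binary masks. The sketch below assumes 4-connectivity and plain nested lists for the image and masks; both are illustrative choices not fixed by the description.

```python
from collections import deque


def region_grow(gray, seed, threshold=20):
    """Grow a region from seed (row, col): add 4-connected pixels whose
    gray-level difference from the seed pixel is smaller than threshold."""
    h, w = len(gray), len(gray[0])
    sy, sx = seed
    seed_val = gray[sy][sx]
    mask = [[0] * w for _ in range(h)]
    mask[sy][sx] = 1
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny][nx]
                    and abs(gray[ny][nx] - seed_val) < threshold):
                mask[ny][nx] = 1
                queue.append((ny, nx))
    return mask


def majority_vote(mask_a, mask_b, mask_c):
    """A pixel is foreground when at least two of the three masks agree."""
    return [[1 if a + b + c >= 2 else 0 for a, b, c in zip(ra, rb, rc)]
            for ra, rb, rc in zip(mask_a, mask_b, mask_c)]
```

A pixel that satisfies the gray-level test but is not connected to the seed through accepted pixels stays outside the region, which is the intended behaviour of region growing.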
Preferably, the breast tumor growth trend analysis module comprises the following functions:
Performing breast tumor blood vessel fluid feature analysis based on the breast tumor target area image data to generate breast tumor blood vessel fluid feature data;
Performing breast tumor metastasis feature analysis according to the breast tumor blood vessel fluid feature data to generate breast tumor metastasis feature data;
Performing breast tumor metastasis pattern analysis processing according to the breast tumor metastasis feature data to generate breast tumor metastasis pattern data;
And performing breast tumor growth trend analysis according to the breast tumor nuclear division index and the breast tumor metastasis pattern data to generate breast tumor growth trend data.
According to the invention, breast tumor blood vessel fluid feature analysis is performed based on the breast tumor target area image data to reveal the vascular structure and hemodynamics of the tumor, which is important for evaluating the blood supply of the tumor and for predicting its growth rate and prognosis. The generated blood vessel fluid feature data provides detailed information on blood flow velocity, vessel density, vessel morphology and the like, helping doctors understand the nutrient supply and internal environment of the tumor. Breast tumor metastasis feature analysis is performed according to the breast tumor blood vessel fluid feature data to evaluate the tendency and likelihood of metastasis, which is important for judging the invasiveness of the tumor and selecting a treatment strategy. The generated metastasis feature data helps doctors predict the metastatic potential of the tumor and supports personalized treatment planning and prognosis evaluation. Breast tumor metastasis pattern analysis processing is performed according to the breast tumor metastasis feature data to identify and classify different metastasis patterns, such as lymphatic metastasis and hematogenous metastasis. Understanding the metastasis path and mode of spread of the tumor allows metastasis to be prevented or treated in a targeted way, improving the treatment effect and the survival rate of patients.
Breast tumor growth trend analysis is performed according to the breast tumor nuclear division index and the breast tumor metastasis pattern data. Combining the nuclear division index with the metastasis pattern data allows the growth rate and proliferative activity of the breast tumor to be estimated, provides detailed information on the growth and development of the tumor, and supplies a scientific basis for prognosis evaluation and decision making.
As an example of the present invention, referring to fig. 2, a functional flow chart of the breast tumor growth trend analysis module in fig. 1 is shown, where the functions of the breast tumor growth trend analysis module in this example include:
S401: performing breast tumor blood vessel fluid feature analysis based on the breast tumor target area image data to generate breast tumor blood vessel fluid feature data;
In the embodiment of the invention, Frangi filtering is performed on the target area image using the frangi function of skimage.filters in the scikit-image library (scale_range set to (1, 10), scale_step set to 2, beta1 set to 0.5, beta2 set to 15) to enhance the vascular structure. The enhanced image is skeletonized using the skeletonization function of skimage to extract the vessel centerlines. A vascular network graph is constructed using the networkx library by converting the vessel centerlines into a graph structure (nodes represent vessel branch points, edges represent vessel segments), and topological features of the network are calculated, such as average degree centrality (nx.degree_centrality), betweenness centrality (nx.betweenness_centrality) and closeness centrality (nx.closeness_centrality). Connected component analysis is performed using skimage, and features such as vessel density (the ratio of vessel pixels to the total number of pixels), the number of vessel branches and the number of vessel curve points are calculated. The optical flow field between two consecutive frames is estimated using the cv2.calcOpticalFlowFarneback function of OpenCV to analyze hemodynamic features. All extracted features (including topological, density, complexity, morphological and dynamic features) are combined to obtain the breast tumor blood vessel fluid feature data.
S402: performing breast tumor metastasis feature analysis according to the breast tumor blood vessel fluid feature data to generate breast tumor metastasis feature data;
In the embodiment of the invention, cluster analysis is performed using sklearn.cluster.KMeans (n_clusters set to 3, with the vessel features empirically divided into low, medium and high risk) to identify potential metastasis risk patterns. A random forest model is constructed using the random forest class of sklearn.ensemble (n_estimators set to 100, max_depth set to 10), and the vessel features are analyzed in association with the known metastasis status to obtain a feature importance ranking. The ROC-AUC value of each feature is calculated using sklearn.metrics.roc_auc_score to estimate its contribution to metastasis prediction. Hypothesis testing (for example, a t-test or a Mann-Whitney U test) is performed using the scipy.stats module to compare the significance of feature differences between the metastatic and non-metastatic groups. Radiomics feature extraction is implemented, using the pyradiomics library to calculate first-order statistics, shape features, texture features and the like. Recursive feature elimination is performed using sklearn.feature_selection.RFE to select the most relevant feature subset. The selected features, importance scores and statistical test results together form the breast tumor metastasis feature data.
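The per-feature ROC-AUC evaluation can be illustrated without sklearn by using the rank interpretation of AUC: the probability that a randomly chosen metastatic case has a higher feature value than a randomly chosen non-metastatic case, with ties counted as one half. The feature names in the usage note are hypothetical.

```python
def feature_auc(values, labels):
    """ROC-AUC of one feature, computed from pairwise comparisons.

    labels: 1 for metastatic cases, 0 for non-metastatic cases.
    """
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def rank_features(feature_table, labels):
    """Sort features by how far their AUC is from chance level (0.5)."""
    scores = {name: feature_auc(col, labels) for name, col in feature_table.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1] - 0.5), reverse=True)
```

For example, `rank_features({"vessel_density": [...], "mean_flow": [...]}, labels)` returns (name, AUC) pairs with the most discriminative feature first. The quadratic pairwise loop is adequate for a sketch; a rank-based formula would be preferable for large cohorts.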
S403: performing breast tumor metastasis pattern analysis processing according to the breast tumor metastasis feature data to generate breast tumor metastasis pattern data;
In the embodiment of the invention, t-SNE dimension reduction is performed using sklearn.manifold.TSNE (n_components set to 2, perplexity set to 30) to visualize the high-dimensional feature distribution. Density clustering is performed using sklearn.cluster.DBSCAN (eps set to 0.5, min_samples set to 5) to identify different metastasis pattern groups. A feature association network is built using networkx, with edge weights based on the mutual information between features (calculated using sklearn.metrics.mutual_info_score), and community detection is performed using the networkx.algorithms.community.greedy_modularity_communities function to find feature modules. A deep learning model based on a graph convolutional network is implemented (using the pytorch_geometric library) to capture complex nonlinear relationships among features. A Cox proportional hazards model is implemented using the lifelines library to analyze the influence of different feature combinations on the time to metastasis. A decision tree model is constructed using sklearn.tree.DecisionTreeClassifier (max_depth set to 5) to extract interpretable metastasis rules. The SHAP (SHapley Additive exPlanations) library is used to calculate the SHAP values of the features and interpret the model predictions. A Markov chain Monte Carlo (MCMC) method is implemented for Bayesian inference to estimate the posterior distribution of the metastasis probability. The clustering results, network analysis results, survival analysis results, decision rules, SHAP values and the like together form the breast tumor metastasis pattern data.
S404: and carrying out breast tumor growth trend analysis according to the breast tumor nuclear division index and the breast tumor metastasis mode data to generate breast tumor growth trend data.
In the embodiment of the invention, breast tumor growth trend analysis is performed according to the breast tumor nuclear division index and the breast tumor metastasis pattern data. scipy.stats.pearsonr is applied to calculate the correlation coefficient between the nuclear division index and each metastasis feature. A generalized additive model (GAM) is implemented using the statsmodels library to analyze the nonlinear influence of the nuclear division index and the metastasis features on tumor volume. A dynamic time warping (DTW) algorithm is implemented to compare the similarity of the tumor growth curves of different patients. A time series prediction model is constructed using the prophet library to predict short-term tumor volume change, and sklearn is also used for prediction. A cellular automaton model is implemented to simulate the tumor growth process under spatial constraints. A system of ordinary differential equations is solved using scipy.integrate.odeint to implement a tumor growth mathematical model based on the Gompertz equation, and the PyMC library is used to implement a Bayesian hierarchical model that estimates growth parameters at the population level and the individual level. Survival curves under different growth patterns are estimated using survival analysis techniques (the KaplanMeierFitter class of the lifelines library). Monte Carlo simulation is implemented (using the numpy.random module) to generate a large number of possible growth trajectories and evaluate the uncertainty of the prediction. The prediction results, model parameters, survival curves, uncertainty estimates and the like together form the breast tumor growth trend data.
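The dynamic time warping comparison of growth curves can be sketched with the classic dynamic programming recurrence. The sketch assumes each curve is a plain list of tumor volumes, uses the absolute difference as the local cost and applies no warping window; these are illustrative choices not specified in the embodiment.

```python
def dtw_distance(curve_a, curve_b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(curve_a), len(curve_b)
    inf = float("inf")
    # cost[i][j] is the best alignment cost of the first i and first j samples
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(curve_a[i - 1] - curve_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch curve_b
                                     cost[i][j - 1],      # stretch curve_a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Two curves with the same shape but different sampling rates, such as [1, 2, 3] and [1, 2, 2, 3], have a distance of zero, which is why DTW is suited to comparing patients imaged at different intervals.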
Preferably, performing the breast tumor blood vessel fluid feature analysis based on the breast tumor target area image data comprises:
performing target area blood vessel density analysis on the breast tumor target area image data to generate breast tumor target area blood vessel density data;
performing target area fluid characteristic analysis on the breast tumor target area image data to generate breast tumor target area fluid characteristic data;
And performing breast tumor blood vessel fluid feature analysis according to the breast tumor target area blood vessel density data and the breast tumor target area fluid feature data to generate breast tumor blood vessel fluid feature data.
According to the invention, target area blood vessel density analysis is performed on the breast tumor target area image data to evaluate the blood supply of the tumor. High vessel density is usually associated with ample nutrient supply and strong invasiveness, so the generated vessel density data can serve as a measure of the severity and extent of spread of the breast tumor lesion. Target area fluid feature analysis is performed on the breast tumor target area image data to reveal the distribution and movement of fluid inside the tumor and to characterize its internal structure and fluid composition, providing an important basis for clinical diagnosis and treatment strategy. The generated fluid feature data reflects the dynamic changes of fluid inside the tumor, including its accumulation, diffusion and drainage, which helps monitor tumor growth and the response to treatment. Breast tumor blood vessel fluid feature analysis is then performed according to the breast tumor target area blood vessel density data and the breast tumor target area fluid feature data, giving a more comprehensive picture of the vascular system and fluid dynamics of the breast tumor. The generated blood vessel fluid feature data not only assists in diagnosing the nature and characteristics of the tumor but also provides a scientific basis for formulating personalized treatment strategies and optimizing the treatment effect and patient prognosis.
In the embodiment of the invention, target area blood vessel density analysis is performed on the breast tumor target area image data as follows. Contours are detected using the cv2.findContours function and the vessel contours are drawn using the cv2.drawContours function. The vessel area ratio (the number of vessel pixels divided by the total number of pixels) is calculated as the overall vessel density. Vessels are labeled using the skimage.measure.label function, and the attributes of each vessel region (such as area, perimeter and eccentricity) are then calculated using the skimage.measure.regionprops function. A histogram of the vessel size distribution is calculated using the histogram function of numpy, and the distance from each non-vessel pixel to the nearest vessel is calculated using scipy to produce a distance map. Based on the distance map, the mean vessel spacing and its standard deviation are calculated using the mean and std functions of numpy. Finally, the calculated indicators (overall density, size distribution, mean spacing and the like) are packaged in a DataFrame of the pandas library to obtain the breast tumor target area blood vessel density data.
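The overall density and spacing indicators can be sketched on a binary vessel mask. As a dependency-free stand-in for the scipy distance computation, the sketch uses a breadth-first search that yields city-block (4-connected) distances rather than Euclidean ones; this substitution and the function names are assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def vessel_density(mask):
    """Number of vessel pixels divided by the total number of pixels."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def distance_to_vessel(mask):
    """City-block distance from every pixel to the nearest vessel pixel."""
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    while queue:                      # multi-source breadth-first search
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def spacing_stats(mask):
    """Mean and population standard deviation of distances of non-vessel pixels."""
    dist = distance_to_vessel(mask)
    values = [d for row_m, row_d in zip(mask, dist)
              for m, d in zip(row_m, row_d) if not m and d is not None]
    if not values:
        return 0.0, 0.0
    return mean(values), pstdev(values)
```

The population standard deviation is used because it matches the default behaviour of the numpy std function named in the embodiment.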
Target area fluid feature analysis is performed on the breast tumor target area image data as follows. The optical flow field between two consecutive frames is calculated using the cv2.calcOpticalFlowFarneback function of OpenCV (pyr_scale set to 0.5, levels set to 3, winsize set to 15, iterations set to 3, poly_n set to 5, poly_sigma set to 1.2). The magnitude of the optical flow field is calculated using the hypot function of numpy, the average flow velocity using the mean function, and the standard deviation of the flow velocity using the std function. The skewness and kurtosis of the flow velocity distribution are calculated using the scipy.stats module. The spatial gradient of the flow field is calculated using the gradient function of numpy, from which the divergence and curl are derived. The flow field is clustered using sklearn.cluster.KMeans (n_clusters set to 3) to identify different flow patterns. Power spectral density analysis is performed using the scipy.signal.welch function to identify the main frequency components of the flow field, and the entropy of the flow field is calculated using the skimage.measure.shannon_entropy function to evaluate flow complexity. Analysis based on particle image velocimetry (PIV) is implemented, with local flow velocity vectors calculated using the process_pair function of the openpiv library (window_size set to 32, overlap set to 16). All extracted features (average flow velocity, standard deviation, skewness, kurtosis, divergence, curl, clustering results, spectral features, entropy, PIV results and the like) are packaged in a DataFrame of pandas to obtain the breast tumor target area fluid feature data.
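Once a flow field is available, the speed statistics, divergence and curl named above follow from simple arithmetic. The sketch below assumes the flow is given as two 2-D lists `u` and `v` (x and y velocity components, unit grid spacing) and uses central differences on interior pixels only, a simplification of what numpy.gradient would compute.

```python
import math
from statistics import mean, pstdev


def flow_statistics(u, v):
    """Summarize a 2-D flow field given its x-component u and y-component v."""
    h, w = len(u), len(u[0])
    speeds = [math.hypot(u[y][x], v[y][x]) for y in range(h) for x in range(w)]
    div, curl = [], []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            du_dx = (u[y][x + 1] - u[y][x - 1]) / 2.0
            du_dy = (u[y + 1][x] - u[y - 1][x]) / 2.0
            dv_dx = (v[y][x + 1] - v[y][x - 1]) / 2.0
            dv_dy = (v[y + 1][x] - v[y - 1][x]) / 2.0
            div.append(du_dx + dv_dy)    # divergence: net outflow
            curl.append(dv_dx - du_dy)   # curl: local rotation
    return {
        "mean_speed": mean(speeds),
        "std_speed": pstdev(speeds),
        "mean_divergence": mean(div) if div else 0.0,
        "mean_curl": mean(curl) if curl else 0.0,
    }
```

A purely expanding field gives positive divergence and zero curl, while a purely rotating field gives the opposite, so the two indicators separate outflow from swirling motion.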
Breast tumor blood vessel fluid feature analysis is performed according to the breast tumor target area blood vessel density data and the breast tumor target area fluid feature data as follows. The two groups of features are combined using the concatenate function of numpy, and principal component analysis is performed using sklearn.decomposition.PCA (n_components set to 0.95) to reduce the feature dimension. A correlation coefficient matrix between the vessel density features and the fluid features is calculated using scipy.stats.pearsonr, and a feature correlation network is constructed using the networkx library, with edge weights based on the absolute values of the correlation coefficients. The centrality indicators of each node in the network (such as degree centrality, betweenness centrality and eigenvector centrality) are calculated using the networkx.algorithms.centrality module. Hierarchical clustering is performed using sklearn.cluster.AgglomerativeClustering (n_clusters set to 4) to identify feature combination patterns. A random forest model of sklearn.ensemble is constructed (n_estimators set to 100, max_depth set to 10) to analyze the degree of influence of the features on tumor growth, and SHAP (SHapley Additive exPlanations) values are calculated using the SHAP library to interpret the model predictions. Hypothesis testing (for example, a t-test or a Mann-Whitney U test) is performed using the scipy.stats module to compare the blood vessel fluid feature differences among tumors of different subtypes. A Bayesian network model based on PyMC3 is implemented to infer the causal relationships among the features. The analysis results (including the dimension-reduced features, correlation coefficient matrix, network centrality indicators, clustering results, feature importance, SHAP values, statistical test results, causal relationship graphs and the like) are taken as the final breast tumor blood vessel fluid feature data.
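The correlation network step can be sketched as follows: compute the Pearson coefficient for every pair of features and keep its absolute value as the edge weight. The sketch uses plain dictionaries instead of networkx, and the `min_abs_r` filter and weighted-degree summary are illustrative additions rather than parts of the described embodiment.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0                      # constant feature: treat as uncorrelated
    return cov / (sx * sy)


def correlation_edges(features, min_abs_r=0.0):
    """Map each feature pair to |r|, keeping pairs with |r| above min_abs_r."""
    names = list(features)
    edges = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            weight = abs(pearson(features[a], features[b]))
            if weight > min_abs_r:
                edges[(a, b)] = weight
    return edges


def weighted_degree(edges):
    """Sum of incident edge weights per node, a simple centrality proxy."""
    degree = {}
    for (a, b), weight in edges.items():
        degree[a] = degree.get(a, 0.0) + weight
        degree[b] = degree.get(b, 0.0) + weight
    return degree
```

Using the absolute value means that strongly negative relationships, for example between vessel density and flow entropy, contribute to the network as much as strongly positive ones.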
Preferably, the performing breast tumor metastasis pattern analysis processing according to the breast tumor metastasis characteristic data comprises:
Performing breast tumor metastasis feature cluster analysis on the breast tumor metastasis feature data based on a preset clustering algorithm to generate clustered breast tumor metastasis feature data;
and carrying out analysis processing on the breast tumor metastasis mode according to the clustered breast tumor metastasis characteristic data, and generating breast tumor metastasis mode data.
According to the invention, breast tumor metastasis feature cluster analysis is performed on the breast tumor metastasis feature data based on a preset clustering algorithm. Cluster analysis groups breast tumor cases with similar metastasis features, so that different metastasis patterns such as lymphatic metastasis and hematogenous metastasis can be identified and distinguished. It also reduces the complex metastasis feature data to a small number of representative clusters, which simplifies the data structure and helps doctors better understand the diversity of breast tumor metastasis patterns. Breast tumor metastasis pattern analysis processing is then performed according to the clustered breast tumor metastasis feature data: the metastasis pattern within each cluster, including metastasis type, frequency and path, is analyzed in depth to provide a clear assessment of metastasis tendency, which helps predict tumor development and patient prognosis. The generated metastasis pattern data provides a basis for formulating personalized treatment strategies, so that the most suitable treatment scheme and management strategy can be selected according to the characteristics of each metastasis pattern to optimize the treatment effect and the survival rate of patients.
In the embodiment of the invention, breast tumor metastasis feature cluster analysis is performed on the breast tumor metastasis feature data based on a preset clustering algorithm. sklearn.cluster.KMeans is adopted as the preset clustering algorithm (n_clusters set to 5, with the optimal number of clusters determined empirically or by the silhouette coefficient) to cluster the breast tumor metastasis feature data. Cluster evaluation indicators such as the silhouette coefficient (silhouette_score) and the Calinski-Harabasz index (calinski_harabasz_score) are calculated using the sklearn.metrics module, t-SNE dimension reduction is performed using sklearn.manifold.TSNE (n_components set to 2, perplexity set to 30), and the center point (centroid) of each cluster and the mean and standard deviation of each feature are calculated. Hypothesis testing (such as ANOVA or the Kruskal-Wallis test) is performed across clusters using the scipy.stats module to identify the features that significantly distinguish the clusters. The cluster labels, dimension-reduced features, cluster evaluation indicators, cluster center points, feature statistics and significance test results are taken as the clustered breast tumor metastasis feature data. Breast tumor metastasis pattern analysis processing is then performed according to the clustered breast tumor metastasis feature data. A decision tree model is constructed using sklearn.tree.DecisionTreeClassifier (max_depth set to 5) to extract interpretable metastasis rules. A random forest model is constructed using the random forest class of sklearn.ensemble (n_estimators set to 100, max_depth set to 10) and feature importance is analyzed. SHAP values are calculated using the SHAP library to explain the model's predictions for each cluster.
A Cox proportional hazards model is implemented using the lifelines library to analyze the survival characteristics of different clusters. A feature association network is constructed using networkx, with edge weights based on the mutual information between features (calculated using sklearn.metrics.mutual_info_score), and community detection is performed using the networkx.algorithms.community.greedy_modularity_communities function to find feature modules. Hypothesis testing is performed using the scipy.stats module to compare the differences in clinical indicators among different clusters. Bayesian inference is performed using a Markov chain Monte Carlo (MCMC) method to estimate the metastasis probability distribution of each cluster. The metastasis rules, feature importance, SHAP values, survival analysis results, network analysis results, model performance evaluation, statistical test results, probability distributions and the like together form the breast tumor metastasis pattern data.
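Because the number of clusters may be chosen by the silhouette coefficient, a small reference computation can help when checking results. The sketch below computes the mean silhouette over all samples with Euclidean distance, assigns a silhouette of zero to samples in singleton clusters, and is written for clarity rather than speed; it is not the sklearn implementation.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient for points (tuples) with cluster labels."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue                  # singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)
```

Running the function for several candidate cluster counts and keeping the count with the highest score is one common way to apply it; well-separated clusters score close to 1, and mismatched labels can score below 0.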
Preferably, the analysis of the breast tumor growth trend according to the breast tumor nuclear division index and the breast tumor metastasis mode data comprises:
performing transfer mode time sequence characteristic analysis on the breast tumor transfer mode data to generate transfer mode time sequence characteristic data;
performing preliminary analysis on the growth trend of the breast tumor according to the time sequence characteristic data of the transfer mode, and generating preliminary breast tumor growth trend data;
Performing breast tumor metastasis survival relationship analysis according to the metastasis pattern time sequence characteristic data to generate breast tumor metastasis survival relationship data;
Analyzing the breast tumor nonlinear growth factor according to the breast tumor metastasis survival relationship data and the breast tumor nuclear division index to generate a breast tumor nonlinear growth factor;
And performing the optimization and adjustment of the breast tumor growth trend precision on the preliminary breast tumor growth trend data based on the breast tumor nonlinear growth factor so as to obtain the breast tumor growth trend data.
According to the invention, metastasis pattern time series feature analysis is performed on the breast tumor metastasis pattern data to reveal the dynamic changes in the metastasis process. This helps in understanding the development trend and pattern changes of tumor metastasis and provides basic data for the subsequent growth trend analysis. The generated time series feature data shows how the breast tumor metastasis pattern changes over time, so that time-dependent factors related to the growth trend can be identified. Preliminary breast tumor growth trend analysis is performed according to the metastasis pattern time series feature data. The preliminary analysis predicts the growth rate and initial direction of development of the tumor, provides a foundation for formulating a treatment scheme, and facilitates early intervention and treatment optimization. Breast tumor metastasis survival relationship analysis is performed according to the metastasis pattern time series feature data. Analyzing the relationship between the metastasis pattern and survival helps evaluate the prognosis of a patient, and the generated survival relationship data reveals the influence of different metastasis patterns on patient survival time.
Breast tumor nonlinear growth factor analysis is performed according to the breast tumor metastasis survival relationship data and the breast tumor nuclear division index. This analysis helps in understanding the dynamic process of breast tumor growth, including changes in growth rate and the likely growth trend, and serves as an important reference for predicting the speed and behavior of tumor development. The generated nonlinear growth factor helps formulate a personalized treatment strategy, in which the treatment scheme is adjusted according to the dynamic changes in tumor growth to improve the treatment effect and the survival rate of patients. Breast tumor growth trend accuracy optimization adjustment is performed on the preliminary breast tumor growth trend data based on the breast tumor nonlinear growth factor. Optimizing the growth trend data with the nonlinear growth factor improves the accuracy and predictive power of the growth trend analysis, so that the growth pattern and development trend of the breast tumor are understood more accurately and a more reliable basis is provided for treatment decisions.
In the embodiment of the invention, metastasis pattern time series feature analysis is performed on the breast tumor metastasis pattern data. The time series analysis tools in the statsmodels library, such as the ARIMA model, are used to extract and analyze the time series features of the breast tumor metastasis pattern data. The specific metastasis pattern time series feature data is generated by methods such as moving averages and trend analysis, and data distribution analysis is carried out in combination with scipy. Preliminary breast tumor growth trend analysis is then performed according to the metastasis pattern time series feature data. Based on the generated time series feature data, a generalized additive model (GAM) is applied using the GAM tool in the statsmodels library of Python: a nonlinear model is established to describe the influence of the nuclear division index and each metastasis feature on tumor volume, and variable selection and parameter optimization are performed with the GAM to generate preliminary breast tumor growth trend data, including the predicted tumor volume change and the associated uncertainty analysis.
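The moving average and trend calculations mentioned above are simple enough to sketch directly. The sketch assumes an evenly sampled series of one metastasis pattern indicator, uses a hypothetical window of 3, and measures the trend as the least-squares slope against the sample index; it does not reproduce the ARIMA model.

```python
def moving_average(series, window=3):
    """Simple moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    out = [running / window]
    for i in range(window, len(series)):
        running += series[i] - series[i - window]   # slide the window by one
        out.append(running / window)
    return out


def trend_slope(series):
    """Least-squares slope of the series against its sample index."""
    n = len(series)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_t = (n - 1) / 2.0
    mean_y = sum(series) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den
```

Smoothing first and then taking the slope of the smoothed series gives a trend estimate that is less sensitive to single noisy measurements.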
Breast tumor metastasis survival relationship analysis is performed according to the metastasis pattern time sequence feature data using a survival analysis technique, for example the KaplanMeierFitter class in the lifelines library of Python. The patient survival function is estimated by the Kaplan-Meier method, taking into account the influence of the breast tumor metastasis pattern on survival time, and survival differences between metastasis patterns are evaluated by a log-rank test or a Cox proportional hazards model to generate breast tumor metastasis survival relationship data.
Breast tumor nonlinear growth factor analysis is performed according to the breast tumor metastasis survival relationship data and the breast tumor nuclear division index. Combining these two inputs, the odeint function in the scipy.integrate module of Python is used to implement a mathematical tumor growth model based on the Gompertz equation. The breast tumor nonlinear growth factors, namely the growth parameters of the Gompertz model, are estimated by solving the system of ordinary differential equations, so that they reflect the growth dynamics of the breast tumor under different metastasis patterns and nuclear division index conditions and provide a foundation for the subsequent growth trend analysis.
The preliminary breast tumor growth trend data is optimized for accuracy based on the breast tumor nonlinear growth factors. A time series prediction model is constructed with the prophet library in Python, taking the breast tumor nonlinear growth factors as input data for trend analysis and predicting the short-term change in breast tumor volume. A gradient boosting regression tree, implemented with GradientBoostingRegressor in the sklearn.ensemble module, predicts the long-term tumor growth trend. Prediction uncertainty is evaluated in combination with Monte Carlo simulation, and the final breast tumor growth trend data is generated by optimizing and adjusting the accuracy of the preliminary breast tumor growth trend data.
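The Gompertz growth model referred to above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: a hand-written fourth-order Runge-Kutta integrator stands in for scipy.integrate.odeint so that the example has no dependencies, and the initial volume, growth rate, and carrying capacity are hypothetical values rather than parameters estimated from patient data. The model integrated is dV/dt = a * V * ln(K / V).

```python
import math


def gompertz_rhs(volume, growth_rate, capacity):
    """Right-hand side of the Gompertz equation dV/dt = a * V * ln(K / V)."""
    return growth_rate * volume * math.log(capacity / volume)


def integrate_gompertz(v0, growth_rate, capacity, t_end, steps=1000):
    """Integrate the Gompertz ODE from t = 0 to t_end with classical RK4."""
    dt = t_end / steps
    volume = v0
    trajectory = [volume]
    for _ in range(steps):
        k1 = gompertz_rhs(volume, growth_rate, capacity)
        k2 = gompertz_rhs(volume + 0.5 * dt * k1, growth_rate, capacity)
        k3 = gompertz_rhs(volume + 0.5 * dt * k2, growth_rate, capacity)
        k4 = gompertz_rhs(volume + dt * k3, growth_rate, capacity)
        volume += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        trajectory.append(volume)
    return trajectory


def gompertz_closed_form(v0, growth_rate, capacity, t):
    """Analytic solution, used here only to check the numerical result."""
    return capacity * math.exp(math.log(v0 / capacity) * math.exp(-growth_rate * t))


# Hypothetical parameters: 1 cm^3 initial volume, rate 0.05 per day,
# carrying capacity 50 cm^3, simulated over 60 days.
trajectory = integrate_gompertz(v0=1.0, growth_rate=0.05, capacity=50.0, t_end=60.0)
expected_final = gompertz_closed_form(1.0, 0.05, 50.0, 60.0)
```

In practice the growth rate and carrying capacity would be fitted to observed volumes, possibly as functions of the nuclear division index and metastasis pattern; that fitting step is outside this sketch.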
Preferably, the prognostic breast tumor cut risk prediction module includes the following functions:
acquiring historical breast tumor cutting data of a medical database, wherein the historical breast tumor cutting data comprises historical breast tumor cutting area data and historical cutting risk tag data;
based on a preset convolutional neural network algorithm and historical breast tumor cutting data, building a relation prediction model of a breast tumor cutting area and cutting risks so as to build a breast tumor cutting risk prediction model;
performing prognosis breast tumor target area image analysis on the breast tumor target area image data according to the breast tumor growth trend data to generate prognosis breast tumor target area image data;
transmitting the prognosis breast tumor target area image data to the breast tumor cutting risk prediction model for prognosis breast tumor cutting risk prediction, and generating prognosis breast tumor cutting risk prediction data.
According to the invention, the historical breast tumor cutting data of the medical database is acquired so as to accumulate a large number of actual operation records. This provides a rich data basis for building the model and helps evaluate the prediction accuracy of cutting risk and the adaptability of the model. The historical cutting risk tag data provides labeled training samples, which helps establish an accurate cutting risk prediction model and improves the reliability and accuracy of prediction. A relation prediction model of the breast tumor cutting area and cutting risk is established based on a preset convolutional neural network algorithm and the historical breast tumor cutting data. The convolutional neural network can effectively learn the complex relationship between the breast tumor cutting area and the cutting risk and can process the spatial relationships and features in the image data, which improves the prediction accuracy of cutting risk. Prognosis breast tumor target area image analysis is performed on the breast tumor target area image data according to the breast tumor growth trend data. Because the image analysis incorporates the growth trend data, the target area image data can be updated in real time and reflects the tumor state and development trend of the patient more accurately. This assists in adjusting the treatment strategy, for example in planning the breast tumor cutting decision, so that necessary interventions can be taken before the condition worsens, which helps improve the treatment success rate and patient survival. The prognosis breast tumor target area image data is transmitted to the breast tumor cutting risk prediction model for prognosis breast tumor cutting risk prediction, so that a cutting risk prediction result can be generated for the individual patient, providing personalized surgical risk assessment and decision support and helping reduce the likelihood of surgical risks and complications.
As an example of the present invention, referring to fig. 2, a functional flow diagram of the prognostic breast tumor cut risk prediction module in fig. 1 is shown, where the prognostic breast tumor cut risk prediction module in this example includes:
S501: acquiring historical breast tumor cutting data of a medical database, wherein the historical breast tumor cutting data comprises historical breast tumor cutting area data and historical cutting risk tag data;
In the embodiment of the invention, the historical breast tumor cutting data in the medical database, including the breast tumor cutting area data and the cutting risk tag data, is acquired, and the data is loaded and cleaned using the pandas library in Python to ensure data quality. For the breast tumor cutting area data, the image files are parsed, or DICOM-formatted image data is read, using a medical image processing library (e.g., pydicom). The historical cutting risk tag data, which comprises risk level classifications or probability distributions, is organized at the same time. Data integrity and privacy are ensured by adopting appropriate data encryption and access control measures.
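The cleaning step described above can be illustrated with a minimal sketch. It uses the standard-library csv module in place of pandas so that it runs without dependencies, and the column names (`patient_id`, `image_path`, `risk_label`), the label vocabulary, and the sample rows are all hypothetical; the actual schema of the medical database is not specified in this description.

```python
import csv
import io

# Hypothetical export; column names and label vocabulary are assumptions.
RAW_CSV = (
    "patient_id,image_path,risk_label\n"
    "P001,scans/p001.dcm,low\n"
    "P002,,high\n"
    "P003,scans/p003.dcm, High\n"
    "P004,scans/p004.dcm,unknown\n"
    "P001,scans/p001.dcm,low\n"
)
VALID_LABELS = {"low", "medium", "high"}


def clean_records(csv_text):
    """Normalize fields and drop incomplete, invalid, or duplicate rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {key: (value or "").strip() for key, value in row.items()}
        record["risk_label"] = record["risk_label"].lower()
        if not record["image_path"]:
            continue  # no image to pair with the label
        if record["risk_label"] not in VALID_LABELS:
            continue  # label outside the assumed vocabulary
        key = (record["patient_id"], record["image_path"])
        if key in seen:
            continue  # exact duplicate of an earlier record
        seen.add(key)
        cleaned.append(record)
    return cleaned


records = clean_records(RAW_CSV)
```

With pandas, the same rules would typically be expressed with `dropna`, `drop_duplicates`, and a membership filter on the label column.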
S502: based on a preset convolutional neural network algorithm and historical breast tumor cutting data, building a relation prediction model of a breast tumor cutting area and cutting risks so as to build a breast tumor cutting risk prediction model;
In the embodiment of the invention, a relation prediction model of the breast tumor cutting area and cutting risk is established using a Convolutional Neural Network (CNN) algorithm, and the CNN model is built with the TensorFlow or PyTorch framework in Python. A suitable network structure is designed, for example one comprising multiple convolutional layers, pooling layers, and fully connected layers. In the model training phase, the historical breast tumor cutting data is divided into a training set and a validation set, and model performance is assessed by cross-validation or a hold-out set. Model training uses an appropriate loss function (e.g., cross-entropy loss), optimizer (e.g., the Adam optimizer), and learning rate scheduling strategy, and the GPU is used to accelerate training. The trained CNN model is then used to predict the cutting area and cutting risk for a new breast tumor cutting image. Interpretability analysis and uncertainty evaluation of the prediction results improve the reliability and accuracy of the model in actual clinical application. The performance indicators of the model are monitored in real time, and the model is adjusted and retrained as necessary to ensure its continued effectiveness and adaptability.
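The cross-validation assessment mentioned above can be sketched as a generator of train and validation index splits. This is a dependency-free illustration; the function name `k_fold_indices`, the fold count, and the seed are hypothetical choices, and a library routine such as the KFold class of scikit-learn would normally be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Hypothetical data set of 10 samples evaluated with 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one validation fold, so averaging the validation metric over the folds gives a performance estimate that uses all of the historical data.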
S503: performing prognosis breast tumor target area image analysis on the breast tumor target area image data according to the breast tumor growth trend data to generate prognosis breast tumor target area image data;
In the embodiment of the invention, dynamic contour segmentation is implemented using the active_contour function of the skimage.segmentation module to extract the tumor boundary. Image matrix operations based on the numpy library are used to enlarge or deform the segmentation result according to the breast tumor growth trend data. The image is resized using the cv2.resize function of the opencv-python library so that its size matches the input size of the model. The processed image data is converted into a numpy array and stored in .npy format using the save function of numpy, serving as the prognosis breast tumor target area image data.
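The enlargement and resizing steps above might be sketched as follows. The description does not state how a predicted volume change maps to a change of the contour, so this sketch assumes isotropic growth, in which the linear scale factor is the cube root of the volume ratio; that mapping, the function names, and the sample data are assumptions for illustration. Plain Python lists stand in for numpy arrays, and a nearest-neighbour resize stands in for cv2.resize.

```python
def scale_contour(points, volume_ratio):
    """Scale contour points about their centroid.

    Assumes isotropic growth: linear factor = volume_ratio ** (1/3).
    """
    factor = volume_ratio ** (1.0 / 3.0)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in points]


def resize_nearest(image, out_height, out_width):
    """Nearest-neighbour resize of a 2D image stored as a list of rows."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


# Hypothetical square contour and a predicted eightfold volume increase.
contour = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
grown = scale_contour(contour, volume_ratio=8.0)

# Hypothetical 2x2 mask resized to a 4x4 model input.
mask = [[1, 0], [0, 1]]
resized = resize_nearest(mask, 4, 4)
```

An anisotropic deformation informed by the metastasis pattern would require a different mapping than the uniform scaling shown here.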
S504: transmitting the image data of the target region of the prognosis breast tumor to a breast tumor cutting risk prediction model for predicting the prognosis breast tumor cutting risk, and generating prognosis breast tumor cutting risk prediction data.
In the embodiment of the invention, the load function of numpy is used to load the preprocessed prognosis breast tumor target area image data, the torch.from_numpy function is used to convert the numpy array into a PyTorch tensor, and the tensor is moved to the selected device (the GPU, if available). The previously saved parameters of the breast tumor cutting risk prediction model are loaded, the model is set to evaluation mode (model.eval()), and gradient computation is disabled using the torch.no_grad() context manager. A prediction result is obtained through forward propagation of the model. The probability distribution over the categories is calculated from the model output, and the predicted category is obtained using torch.argmax. The prediction results are converted back into numpy arrays, and the predicted risk category, probability, and other information are packaged using the DataFrame class of pandas to generate the prognosis breast tumor cutting risk prediction data, including the breast tumor cutting operation types and the corresponding cutting risk probabilities.
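The post-processing of the model output described above can be illustrated without PyTorch. The sketch below converts raw class scores (logits) into a probability distribution with a numerically stable softmax, selects the most probable class, and packages the result as a record. The class names in `RISK_CLASSES`, the case identifier, and the logits are hypothetical; in the embodiment the logits would come from the forward pass of the trained model.

```python
import math

RISK_CLASSES = ["low", "medium", "high"]  # hypothetical label vocabulary


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction_record(case_id, logits):
    """Package the predicted class and its probabilities for one case."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return {
        "case_id": case_id,
        "predicted_risk": RISK_CLASSES[best],
        "probability": probs[best],
        "probabilities": dict(zip(RISK_CLASSES, probs)),
    }


# Hypothetical logits standing in for the output of the trained model.
record = to_prediction_record("case-001", [0.2, 1.5, 3.1])
```

A list of such records can be passed directly to the pandas DataFrame constructor to obtain the tabular prediction data.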
Preferably, the establishing the relation prediction model for the breast tumor cutting area and the cutting risk based on the preset convolutional neural network algorithm and the historical breast tumor cutting data includes:
taking the historical breast tumor cutting area data as input data and the historical cutting risk tag data as output data to design a breast tumor cutting risk training set;
establishing a mapping relation between a breast tumor cutting area and a cutting risk based on a preset convolutional neural network algorithm to construct a breast tumor cutting risk prediction model framework;
transmitting the breast tumor cutting risk training set to a breast tumor cutting risk prediction model framework for model training so as to obtain a breast tumor cutting risk prediction model.
According to the invention, the historical breast tumor cutting area data is used as input data and the historical cutting risk tag data as output data, establishing the association between the two. This helps the model learn the correspondence between different cutting area features and cutting risks and improves the accuracy and reliability of the prediction model. The mapping relation between the breast tumor cutting area and the cutting risk is established based on a preset convolutional neural network algorithm. The convolutional neural network can effectively capture the complex nonlinear mapping between the breast tumor cutting area and the cutting risk, so that the model extracts features from the high-dimensional image data and associates them with the risk tags, improving the accuracy and reliability of prediction. The breast tumor cutting risk training set is transmitted to the breast tumor cutting risk prediction model framework for model training. The trained model can process new breast tumor image data in real time and predict the cutting risk of a patient quickly and accurately, which assists in making more accurate decisions before an operation and improves the success rate of the operation and the safety of the patient.
In the embodiment of the invention, the previously stored historical breast tumor cutting area data and historical cutting risk tag data are read using the pandas library, and the cutting risk tags are converted into numerical form using LabelEncoder from sklearn. The data set is divided into a training set, a validation set, and a test set in a ratio of 8:1:1 using the train_test_split function of the sklearn.model_selection module. The image data is augmented with operations such as random rotation, flipping, and scaling, implemented with the albumentations library, to increase data diversity. A custom Dataset class is created that inherits from torch.utils.data.Dataset and implements the __getitem__ and __len__ methods, thereby designing the breast tumor cutting risk training set.
The model architecture adopts ResNet as the backbone network, with pre-trained weights loaded through torchvision.models. The last fully connected layer of ResNet is modified so that its output matches the number of cutting risk categories. An attention mechanism is added in the form of an SE (Squeeze-and-Excitation) module to strengthen the attention of the model to important features, and an FPN (Feature Pyramid Network) structure is implemented to extract multi-scale features. Several custom convolutional layers and fully connected layers are added after the FPN for risk prediction. Regularization is implemented using nn.Dropout to prevent overfitting. An appropriate loss function is selected, such as nn.CrossEntropyLoss for multi-class classification or nn.BCEWithLogitsLoss for multi-label classification; an optimizer such as torch.optim.Adam is defined; and the learning rate is adjusted dynamically using the learning rate scheduler torch.optim.lr_scheduler.ReduceLROnPlateau. The model structure is printed using the torchsummary library to check the parameters and computational complexity, thereby constructing the breast tumor cutting risk prediction model framework.
The preprocessed training set and validation set data are loaded. A training function is defined that includes the forward propagation, loss calculation, back propagation, and parameter update steps; a validation function is implemented to evaluate model performance on the validation set; and a progress bar is created using the tqdm library to visualize the training process. In the training loop, each epoch traverses the entire training set, and the model parameters are updated after each mini-batch. The model is evaluated periodically on the validation set, and metrics such as loss and accuracy are recorded. The metrics of the training process are logged with tensorboard to facilitate visual analysis. An early stopping mechanism is implemented: training stops when the validation performance no longer improves, which prevents overfitting. After training, the performance of the model is evaluated on the test set by calculating the accuracy, precision, recall, and F1 score, and a confusion matrix and an ROC curve are generated using the sklearn.metrics module to evaluate the model comprehensively, thereby obtaining the breast tumor cutting risk prediction model. The model is converted into TorchScript format using torch.jit.script for deployment in a production environment.
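Two of the training-procedure elements described above, the 8:1:1 data split and the early stopping mechanism, can be sketched without any framework. The helper names `split_8_1_1` and `EarlyStopping`, the patience value, and the simulated validation losses are hypothetical; in the embodiment the split would be produced with train_test_split and the losses would come from the validation function.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle and split items into train, validation, and test sets (8:1:1)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    return (
        shuffled[:n_train],
        shuffled[n_train : n_train + n_val],
        shuffled[n_train + n_val :],
    )


class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss  # improvement: remember it and reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


train_set, val_set, test_set = split_8_1_1(range(100))

# Simulated per-epoch validation losses (hypothetical values).
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.60, 0.59]
stopper = EarlyStopping(patience=3)
stopped_epoch = None
for epoch, loss in enumerate(val_losses):
    if stopper.should_stop(loss):
        stopped_epoch = epoch
        break
```

A random split of this kind is not stratified; for imbalanced risk categories a stratified split would likely be preferable, and saving the model weights at each improvement lets training resume from the best epoch after stopping.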
The beneficial effects of the application are as follows. Through target area segmentation and feature extraction on the breast tumor image data, the intelligent analysis system for data before breast tumor resection can accurately identify and extract the breast tumor target area, providing the necessary data support for subsequent growth trend analysis and treatment decisions. By comprehensively analyzing the nuclear division index and metastasis pattern features of the multi-source breast tumor data, the system can evaluate the growth rate and spread tendency of the tumor and provide predictive information that helps adjust the treatment strategy and prevent postoperative recurrence. Surgical risk is quantified and decision support is provided according to the breast tumor growth trend data and the image analysis results, enabling data-driven treatment selection. The prognosis breast tumor cutting risk prediction data and the prognosis breast tumor treatment auxiliary decision feed back personalized prognosis breast tumor cutting information, helping doctors optimize the surgical plan and manage the postoperative recovery of the patient. By combining the multi-source breast tumor case information of the patient, the method achieves unified quantitative analysis of the prognosis breast tumor, so that the combined image features and pathological features support better analysis of the morphology and pathological condition of the prognosis breast tumor, thereby providing accurate and efficient auxiliary decisions for prognosis breast tumor treatment.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.