CN120689588A

CN120689588A - Microarray chip image recognition method and system

Info

Publication number: CN120689588A
Application number: CN202510643188.XA
Authority: CN
Inventors: 于柏淼; 曹东明; 吕佳; 董洋; 牛倩; 胡迪; 贾岩
Original assignee: Beijing Micropixel Intelligent Technology Co ltd; Dansheng Beijing Medical Technology Co ltd
Current assignee: Beijing Micropixel Intelligent Technology Co ltd; Dansheng Beijing Medical Technology Co ltd
Priority date: 2025-05-19
Filing date: 2025-05-19
Publication date: 2025-09-23

Abstract

The present invention provides a method and system for recognizing microarray chip images. The method includes: using a Block target detection model to locate the Block circumscribed rectangular frame of the array, wherein the Block target detection model is generated by training a predetermined target detection model on image sets of multiple array arrangements; using a Spot semantic segmentation model to semantically segment the Block circumscribed rectangular frame area and generate a Spot image block mask map, and adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the Spot image block mask map; and using the Spot image block mask map to calculate the centroid coordinates and radius of each Spot area within the adjusted Block circumscribed rectangular frame, and finally determining the circumscribed circle of each Spot image block. The present invention accurately identifies the position of the Block and accurately aligns the Spots, while significantly improving analysis efficiency and shortening analysis time.

Description

Identification method and system for microarray chip image

Technical Field

The present invention relates to the field of chip technologies, and in particular, to a method and a system for identifying images of a microarray chip.

Background

A microarray chip is produced by immobilizing a large number of biological macromolecules, such as nucleic acid fragments and polypeptide molecules, or other biological samples, such as tissue sections and cells, in an ordered manner on the surface of a support (for example a glass slide or a nylon membrane) by light-directed in-situ synthesis, micro-spotting or a similar method, so as to form a dense two-dimensional molecular array. The array is then reacted with labeled target molecules in the biological sample to be tested, and the intensity of the reaction signals is detected and analyzed rapidly, in parallel and efficiently by a dedicated instrument such as a laser confocal scanner or a charge-coupled device (CCD) camera, so that the quantity of target molecules in the sample can be determined. According to the probes immobilized on the chip, microarray chips include gene chips, protein chips, cell chips, tissue chips and the like.

Image recognition is an important step in gene chip data analysis, and its accuracy directly affects subsequent bioinformatic interpretation and the understanding of biological phenomena. However, because gene chip images are themselves complex, existing image recognition techniques often struggle with images that have complex shapes, unclear boundaries or severe background interference. For example, arrays come with a variety of array configuration files, and the array layout relies on the calibration points being bright in the image; if the calibration points do not light up, automatic positioning usually deviates. The traditional algorithm locates the Block directly by a projection method, and this scheme is often disturbed by noise, fence marks, contamination spots and the like, so that the positioning is inaccurate. These problems not only reduce the accuracy of image recognition, but also significantly reduce the efficiency of data processing, thereby affecting the quality and reliability of the overall gene expression analysis.

Disclosure of Invention

The invention provides a method for identifying images of a microarray chip, which comprises the following steps:

Utilizing a Block target detection model to position a Block circumscribed rectangular frame of an array, wherein the Block target detection model is generated by training a preset target detection model by utilizing a plurality of image sets distributed in arrays, and each image set distributed in arrays comprises chip image data and corresponding target detection labels;

Performing semantic segmentation on the Block circumscribed rectangular frame region by using a Spot semantic segmentation model to generate a Spot image block mask map, and adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the Spot image block mask map, wherein the Spot semantic segmentation model is generated by training a semantic segmentation network model by using Spot image blocks cut from the Block circumscribed rectangular frame region and the corresponding semantic segmentation labels; and

Calculating, by using the Spot image block mask map, the centroid coordinates and the radius of each Spot region in the adjusted Block circumscribed rectangular frame.

Optionally, training a predetermined target detection model by using image sets of multiple array arrangements to generate the Block target detection model includes:

adjusting the brightness and contrast of chip images arranged in a preset array, and forming JPG images of gray scale, red, green and pseudo color channels respectively;

and labeling the target detection label on the JPG image.
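The brightness and contrast adjustment and the channel images described above can be illustrated with a small sketch. The source does not state how the adjustment is performed, so the percentile stretch below is an assumption, and `stretch_to_8bit` and `to_channels` are hypothetical names. The pseudo color channel is omitted because the source does not specify its color lookup table.

```python
def stretch_to_8bit(pixels, low_pct=1.0, high_pct=99.0):
    """Linearly map 16-bit intensities to 0..255 between two percentiles.

    Assumption: percentile stretching stands in for the unspecified
    automatic brightness/contrast adjustment.
    """
    flat = sorted(v for row in pixels for v in row)
    n = len(flat)
    lo = flat[int((n - 1) * low_pct / 100.0)]
    hi = flat[int((n - 1) * high_pct / 100.0)]
    span = max(hi - lo, 1)  # avoid division by zero on flat images
    return [
        [min(255, max(0, round((v - lo) * 255 / span))) for v in row]
        for row in pixels
    ]


def to_channels(gray8):
    """Render an 8-bit image as gray, red and green (R, G, B) pixel grids."""
    return {
        "gray": [[(v, v, v) for v in row] for row in gray8],
        "red": [[(v, 0, 0) for v in row] for row in gray8],
        "green": [[(0, v, 0) for v in row] for row in gray8],
    }


# Tiny 2 x 2 stand-in for a 16-bit TIFF; the full range is used here.
raw = [[0, 1000], [30000, 65535]]
adjusted = stretch_to_8bit(raw, low_pct=0.0, high_pct=100.0)
channels = to_channels(adjusted)
```

Each rendering would then be saved as a JPG and annotated separately, which is how one TIFF image yields several labeled samples.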

Optionally, the predetermined target detection model is a target detection model based on YoloV, the initial learning rate parameter of the target detection model based on YoloV is 0.01, and the loss function is the nn.BCEWithLogitsLoss function provided by PyTorch.

Optionally, adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the spot image Block mask map comprises projecting the spot image Block mask map from a preset angle range, calculating information entropy at the same time, determining that the angle corresponding to the minimum information entropy is the offset angle of the Block, and rotating the Block circumscribed rectangular frame by the offset angle under the condition that the offset angle of the Block is larger than a threshold value.

Optionally, adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the spot image Block mask map includes locating the final position of the Block circumscribed rectangular frame in a mode of matching a standard template constructed by using the attribute of the Block with the spot image Block mask map.

Optionally, the standard template is constructed by using the attribute of the Block, including constructing the standard template by using the row and column numbers and the row and column spacing of the Block.

Optionally, the method further comprises determining a circumcircle of each spot image block according to the centroid coordinates and the radius.

The invention provides a system for identifying images of a microarray chip, which comprises:

The Block positioning unit is used for positioning a Block external rectangular frame of the array by using a Block target detection model, the Block target detection model is generated by training a preset target detection model by using a plurality of image sets arranged in arrays, and each image set arranged in arrays comprises chip image data and corresponding target detection labels;

The Spot semantic segmentation unit is used for carrying out semantic segmentation on the Block circumscribed rectangular frame area by utilizing a Spot semantic segmentation model to generate a Spot image Block mask map, wherein the Spot semantic segmentation model is generated by training a semantic segmentation network model by utilizing a Spot image Block cut in the Block circumscribed rectangular frame area and a corresponding semantic segmentation label;

The Block positioning adjustment unit is used for adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the spot image Block mask map;

The Spot region identification unit is used for calculating the centroid coordinates and the radius of each Spot region in the adjusted Block circumscribed rectangular frame by using the Spot image block mask map.

The invention relates to an electronic device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the identification method of the microarray chip image when executing the program.

The present invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the aforementioned method of identifying images of a microarray chip.

With the technical scheme provided by the invention, the position of the Block can be accurately identified and the Spots can be accurately aligned, while the analysis efficiency is greatly improved and the analysis time is shortened. The scheme introduces a Block positioning model based on target detection, together with a Block and Spot recognition scheme that combines a semantic segmentation model with traditional image algorithms. After alignment, essentially no manual adjustment is needed, and the alignment result is highly consistent with the manual alignment standard. The invention not only saves labor cost and improves efficiency, but also automates the whole system, from locating the Block, to aligning the Spots, to extracting the data.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of main steps of a method for identifying an image of a microarray chip according to the present invention.

Fig. 2 is a flowchart of a specific process of the identification method of the microarray chip image provided by the present invention.

Fig. 3a and 3b are schematic diagrams of a chip image of 1 column and 4 rows and a chip image of 2 columns and 7 rows, respectively, which need to be identified in the present invention.

Fig. 4a and 4b are respectively circumscribed rectangular boxes identified by the block target detection model.

Fig. 5 is a block identification result after accurate positioning.

Fig. 6 is a process variation of the image of the microarray chip in the identification method of the image of the microarray chip according to the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention provides an innovative technical scheme, namely, the identification of the Block and the Spot of the microarray chip image adopts the technical scheme of combining an AI model with a traditional image algorithm, so that the position of the Block can be accurately identified, the Spot can be accurately aligned, the analysis efficiency is greatly improved, and the analysis time is shortened. The present invention will be described in detail with reference to the accompanying drawings.

Fig. 1 shows the method for identifying an image of a microarray chip according to the present invention. As shown in Fig. 1, the method comprises the following steps:

S1, positioning a Block external rectangular frame of an array by using a Block target detection model, wherein the Block target detection model is generated by training a preset target detection model by using a plurality of image sets distributed in arrays, and each image set distributed in arrays comprises chip image data and corresponding target detection labels;

S2, carrying out semantic segmentation on the Block circumscribed rectangular frame area by using a Spot semantic segmentation model to generate a Spot image block mask map, and adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the Spot image block mask map, wherein the Spot semantic segmentation model is generated by training a semantic segmentation network model by using Spot image blocks cut from the Block circumscribed rectangular frame area and the corresponding semantic segmentation labels;

S3, calculating the centroid coordinates and the radius of each Spot region in the adjusted Block circumscribed rectangular frame by using the Spot image block mask map.

The Block target detection model and the Spot semantic segmentation model are both AI models trained with the PyTorch framework, and the method adopts the novel idea of combining AI models with traditional image algorithms.

The identification method of the microarray chip image provided by the invention comprises two stages of Block positioning and Spot identification, as shown in fig. 2, the specific process comprises the following steps:

S01, training a Block target detection model by adopting a YoloV network structure;

S02, training a Spot semantic segmentation model by adopting a UNet++ network structure;

S03, inputting the acquired Tif image of the related microarray chip into a Block target detection model, and outputting external rectangular coordinates of the Block;

S04, cropping the region image of each Block, inputting it into the Spot semantic segmentation model, and segmenting to obtain the region and category of each Spot;

S05, determining the deflection angle of the Block from the binary image of the Spots by a projection method: rotating within the range of 0-180 degrees, projecting and calculating the information entropy once every 0.2 degrees, and, after the traversal, taking the angle corresponding to the minimum information entropy as the deflection angle of the Block;

S06, rotating the Block by the deflection angle when the deflection angle of the Block is larger than a threshold value, and otherwise not rotating;

S07, establishing a standard template by using the attributes of the Block, and matching the binary image against the standard template to obtain the accurate position of the Block;

S08, after the Block is accurately positioned, calculating the distance between the center coordinates of each Circle in the Block and the X, Y coordinates of the Spot centroid calculated by the Spot semantic segmentation model, and, if the distance is within a threshold range, assigning the X, Y coordinates calculated by the Spot semantic segmentation model to the X, Y value of the Circle;

S09, traversing the threshold range within the local area of each Spot, and calculating the Circle center coordinates and radius.
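The assignment rule of step S08 can be sketched as follows. This is a minimal illustration rather than the patented implementation: `snap_circles` and its parameters are hypothetical names, and choosing the nearest detected centroid when several fall within the threshold is an assumption the source does not state.

```python
import math


def snap_circles(template_centers, detected_centroids, max_dist):
    """Replace each template Circle center by a nearby detected Spot centroid.

    A Circle keeps its initial template position when no detected
    centroid lies within max_dist (for example weak or background points).
    """
    snapped = []
    for cx, cy in template_centers:
        best, best_d = None, max_dist
        for sx, sy in detected_centroids:
            d = math.hypot(sx - cx, sy - cy)
            if d <= best_d:  # assumption: the nearest candidate wins
                best, best_d = (sx, sy), d
        snapped.append(best if best is not None else (cx, cy))
    return snapped


centers = [(10, 10), (30, 10)]        # Circle centers from the Block template
centroids = [(11, 9), (100, 100)]     # centroids from the segmentation mask
result = snap_circles(centers, centroids, max_dist=5)
```

In this example the first Circle moves to the detected centroid, while the second has no centroid within range and keeps its template position.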

To identify a microarray chip image, the positions of the spot arrays in the image must first be located accurately, and the arrays come with a variety of array configuration files. The traditional algorithm locates the Block position directly by a projection method. Because this relies on the array layout keeping the calibration points bright, automatic positioning usually deviates if the calibration points do not light up, and the scheme is also often disturbed by noise, fence marks, contamination spots and the like, resulting in inaccurate positioning. In order to locate the various lattice areas (Blocks) in the microarray chip image accurately, as a specific embodiment YoloV is adopted as the target detection deep learning network, and the Block target detection model is generated by training on image data sets with multiple arrangements. With this model it is no longer necessary for all four corners to be bright; the main purpose is to let the model recognize arrays of various layouts.

Chip image data are collected and sorted, and target detection labels are annotated. Taking the 1×4 microarray chip images (Fig. 3a) used to train the Block positioning model as an example, 40 chip images of 1 column and 4 rows in TIFF format are selected. The brightness and contrast of each TIFF image are adjusted automatically, and JPG images of the gray scale, red, green and pseudo color channels are then generated. Target detection labels are annotated on the JPG images, giving 160 labeled samples, which are divided into a training set and a verification set at a ratio of 8:2. 100 rounds of iterative training are performed, with model training and optimization carried out iteratively until a detection model capable of accurately locating the Block region is obtained. The resulting Block target detection model is verified on a further 50 microarray chip images of 1×4, and the result shows a Block detection rate of 100%.

In order to locate the Block region on the 2×7 microarray chip image (Fig. 3b), a certain number of high-quality 2×7 microarray images are added to the training set to fine-tune the existing YoloV-based Block target detection model, so that a target detection model capable of locating the Block region on both types of microarray image is obtained. In order to enable the model to identify the Block areas on more types of arrays, microarray images with other arrangements, such as 4×2 and 2×5, are further added. Finally, the 100 TIFF images of the training sample are divided into a training set and a verification set at a ratio of 8:2, and 100 rounds of iterative training are performed to obtain the Block target detection model. After the target detection is completed, the Box position is the initial position of the Block positioning, and the identification result of the Block target detection model is the circumscribed rectangular frame of the located array.
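The sample count and the 8:2 division in the 1×4 example can be checked with a short sketch. The file names, the fixed random seed and the `split_dataset` helper are hypothetical; the source states only the totals and the ratio.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle the items and split them into training and verification sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# 40 TIFF images, each rendered into four channel images (hypothetical names).
channel_names = ["gray", "red", "green", "pseudo"]
samples = [f"chip_{i:02d}_{c}.jpg" for i in range(40) for c in channel_names]
train, val = split_dataset(samples)
print(len(samples), len(train), len(val))
```

Splitting per channel image, as here, can place renderings of the same chip in both sets. Whether the original split was made per chip or per image is not stated, so grouping by chip before splitting would be an equally reasonable design choice.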

Taking 1×4 and 2×7 microarray chip images as an example, the Block target detection model obtained by the above training can locate the circumscribed rectangular frame of the microarray lattice area (Block), and the results are shown in Fig. 4a and 4b.

It is worth noting that, when training the YoloV-based Block target detection model, the input image size is set to 1024 × 1024, the initial learning rate is 0.01, and the loss function is the nn.BCEWithLogitsLoss function provided by PyTorch; experiments verified that these settings give the best training effect.
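For reference, the loss named above corresponds to `nn.BCEWithLogitsLoss` in PyTorch's `torch.nn` module, which applies a sigmoid to the raw model output and then the binary cross-entropy. For a logit $x_n$ and a target $y_n \in [0, 1]$, the unweighted form with the default mean reduction over $N$ elements is the following; the symbols are introduced here for illustration and do not appear in the source.

```latex
\ell_n = -\bigl[\, y_n \log \sigma(x_n) + (1 - y_n) \log\bigl(1 - \sigma(x_n)\bigr) \bigr],
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}},
\qquad
L = \frac{1}{N} \sum_{n=1}^{N} \ell_n
```

Combining the sigmoid with the logarithm in a single function is numerically more stable than applying them separately.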

After the target detection is completed, the Box position is given as the initial position of the Block positioning. Calculating the Block angle before aligning the microarray Spots is critical, because if the Block angle is offset and left uncorrected, the Spot rows and columns may be completely misaligned, which is very apparent on high-density large arrays. After target detection and positioning, the Block angle is calculated as follows: the image blocks in the region are semantically segmented by the Spot semantic segmentation model to obtain a binary image, the binary image is projected over the range of 0-180 degrees while the information entropy is calculated, and the angle at which the entropy is minimal is the offset angle of the Block. After the angle calculation is completed, the Block image is rotated by this angle when the Block angle offset is larger than the threshold value, and the final accurate position of the Block is then found by matching a template constructed from the row and column numbers and the row and column spacing of the Block with the binary image, as shown in Fig. 5.

In other words, adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the Spot image block mask map comprises projecting the Spot image block mask map over a predetermined angle range while calculating the information entropy, determining the angle corresponding to the minimum information entropy as the offset angle of the Block, and rotating the Block circumscribed rectangular frame by the offset angle when the offset angle of the Block is larger than a threshold value. It further comprises locating the final position of the Block circumscribed rectangular frame by matching a standard template constructed from the attributes of the Block with the Spot image block mask map.
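The entropy-based angle search can be sketched in a few lines. This is an illustrative version under stated assumptions: the foreground pixels are projected directly onto a rotated axis instead of rotating the image, the projection is binned with a width of one pixel, and `projection_entropy` and `estimate_offset_angle` are hypothetical names. When rows or columns line up with the projection direction the histogram concentrates into few bins, so the entropy is lowest.

```python
import math


def projection_entropy(points, angle_deg, bin_width=1.0):
    """Shannon entropy of the 1-D projection of foreground points."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    counts = {}
    for x, y in points:
        b = math.floor((x * c + y * s) / bin_width)
        counts[b] = counts.get(b, 0) + 1
    total = len(points)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def estimate_offset_angle(points, step_deg=0.2, max_deg=180.0):
    """Scan 0..max_deg in step_deg increments; return (angle, entropy)."""
    best_angle, best_entropy = 0.0, float("inf")
    for i in range(int(round(max_deg / step_deg))):
        angle = i * step_deg
        h = projection_entropy(points, angle)
        if h < best_entropy - 1e-12:  # keep the first angle on ties
            best_angle, best_entropy = angle, h
    return best_angle, best_entropy


# Synthetic 6 x 6 grid of single-pixel "spots", tilted by 5 degrees.
tilt = math.radians(5.0)
grid = [(10.5 + 20 * i, 10.5 + 20 * j) for i in range(6) for j in range(6)]
points = [(gx * math.cos(tilt) - gy * math.sin(tilt),
           gx * math.sin(tilt) + gy * math.cos(tilt)) for gx, gy in grid]
angle, entropy = estimate_offset_angle(points)
```

On this synthetic grid the search returns an angle within one scan step of the 5 degree tilt. Because rows and columns both produce minima, angles 90 degrees apart are equivalent, and a real implementation would presumably map the result to a small signed correction.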

In the invention, as a specific implementation manner, as shown in Fig. 6, a UNet++ semantic segmentation model is used to semantically segment the image blocks in a Block to obtain a binary image. Projection is performed on the binary image every 0.2 degrees within the range of 0-180 degrees while the information entropy is calculated, and the angle at which the entropy is minimal is the offset angle of the Block. After the offset angle has been calculated, the Block image block is rotated by this angle, a standard template is then established by using the attributes of the Block, and the accurate position of the Block is found by template matching. Taking 1×4 and 2×7 microarray chips as an example, the rotated Block is shown in fig. 4.

After the Block has been accurately positioned, the accuracy of Spot recognition directly affects the numerical extraction of each final signal point. The advantage of Spot recognition based on semantic segmentation is that the approximate Spot region can be segmented well for all kinds of signal points, and the category of each signal can be given, namely good points, bad points and impurity points. The impurity points in the whole TIFF image fall into two categories, namely dust and fence marks.

The Spot semantic segmentation model is constructed as follows. First, UNet++ is selected as the semantic segmentation network. Spot image blocks are then cut from the Block area after the preceding angle adjustment by means of a sliding window, and Spot image blocks are selected to construct the training set data and annotated with semantic segmentation labels. Specifically, 52 chip images in TIFF format, of 1 column and 4 rows and of 2 columns and 7 rows, are selected, and JPG images of the gray scale, red, green and pseudo color channels are generated in the same way as above. Each JPG image is cut with a sliding window of 500 × 500 pixels and a step size of 380 pixels. Taking sample diversity and balance into account, 3595 image blocks are selected from the cut image blocks and annotated with semantic segmentation labels, giving 3595 labeled samples in total. During training, the labeled data are divided into a training set and a verification set at a ratio of 8:2, and 300 rounds of iterative training with multi-round iterative optimization are performed to obtain the Spot semantic segmentation model. Tested on a test set of 50 TIFF images, the model pixel accuracy coefficient is about 98. 86.01%.

After the Spot semantic segmentation model is constructed, the centroid and the radius of each Spot region are calculated by using the mask map of each Spot. The center and radius are obtained as follows: for the signal point regions segmented by the Spot semantic segmentation model, parameters such as the center, radius and circularity of the Spot are calculated on the original 16-bit TIFF data by image morphology operations. Whether to proceed to the next step is decided according to the category of the signal: good points and bad points are processed further, while impurity points are not. The distance between the initial X, Y position of each Circle centroid in the Block and the X, Y position calculated by the Spot semantic segmentation model is then computed, and if the distance is within the threshold range, the X, Y coordinates calculated by the Spot semantic segmentation model are assigned to the X, Y value of the Circle centroid. Not every Circle in the Block can be detected by the Spot semantic segmentation model, because some positions are indistinguishable from the background; for such weak points and background points, the initial X, Y position of the Circle centroid is retained. The above threshold range is calculated using the conventional Otsu algorithm; as a specific embodiment, a segmentation threshold range is calculated on the 16-bit TIFF image.
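Template construction and matching can be illustrated as follows. The sketch assumes integer pixel pitches, a template made of expected Spot centers, and a simple hit-count score over a small search window; none of these details are given in the source, and `build_template` and `match_template` are hypothetical names.

```python
def build_template(rows, cols, row_pitch, col_pitch):
    """Expected Spot centers (x, y), relative to the top-left Spot."""
    return [(c * col_pitch, r * row_pitch)
            for r in range(rows) for c in range(cols)]


def match_template(mask, template, search):
    """Slide the template over a binary mask; return the best (dx, dy, score).

    The score counts how many template centers land on foreground pixels.
    """
    height, width = len(mask), len(mask[0])
    best = (0, 0, -1)
    for dy in range(search):
        for dx in range(search):
            score = 0
            for tx, ty in template:
                x, y = tx + dx, ty + dy
                if 0 <= x < width and 0 <= y < height and mask[y][x]:
                    score += 1
            if score > best[2]:
                best = (dx, dy, score)
    return best


# Synthetic 30 x 30 mask holding a 3-row, 4-column array with a 5-pixel
# pitch whose top-left Spot sits at x = 3, y = 4.
mask = [[0] * 30 for _ in range(30)]
for r in range(3):
    for c in range(4):
        mask[4 + 5 * r][3 + 5 * c] = 1

template = build_template(rows=3, cols=4, row_pitch=5, col_pitch=5)
offset = match_template(mask, template, search=10)
```

Here the best offset places all 12 template centers on foreground pixels. A production version would more likely correlate a rendered template image against the mask, but the idea of scoring candidate positions is the same.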
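The sliding-window cutting with a 500 × 500 pixel window and a step of 380 pixels can be sketched as below. Consecutive windows then overlap by 120 pixels. How the image edge is handled is not stated in the source, so adding a final window flush with the edge is an assumption, and `window_origins` and `tile_boxes` are hypothetical names.

```python
def window_origins(length, window=500, step=380):
    """Start offsets of the windows along one image axis."""
    if length <= window:
        return [0]
    origins = list(range(0, length - window + 1, step))
    if origins[-1] + window < length:
        origins.append(length - window)  # assumption: cover the far edge
    return origins


def tile_boxes(width, height, window=500, step=380):
    """Crop boxes (left, top, right, bottom) covering the image."""
    return [(x, y, x + window, y + window)
            for y in window_origins(height, window, step)
            for x in window_origins(width, window, step)]


print(window_origins(1500))
print(tile_boxes(880, 500))
```

For an image smaller than the window, the single box returned extends past the image, so such images would need padding before cropping.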
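The Otsu threshold mentioned above selects the intensity that maximizes the between-class variance of background and foreground. The following is a minimal histogram-based sketch for 16-bit intensities; the 256-bin histogram and the name `otsu_threshold` are assumptions for illustration.

```python
def otsu_threshold(values, bins=256, max_value=65535):
    """Return the threshold that maximizes the between-class variance.

    Values below the returned threshold form the background class.
    """
    scale = bins / (max_value + 1)
    hist = [0] * bins
    for v in values:
        hist[int(v * scale)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0 = 0      # number of pixels in the background class so far
    sum0 = 0    # sum of bin indices in the background class so far
    for i, h in enumerate(hist):
        w0 += h
        sum0 += i * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_bin, best_var = i, var
    return (best_bin + 1) / scale  # upper edge of the background bins


# Local area of one Spot: dim background pixels and bright signal pixels.
local_pixels = [100] * 50 + [40000] * 50
threshold = otsu_threshold(local_pixels)
foreground = [v for v in local_pixels if v >= threshold]
```

Applying the threshold within the local area of each Spot, rather than on the whole image, keeps bright neighbouring Spots from dominating the histogram.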

Because the final Spot recognition result is the circumscribed circle of each Spot, once the Spot semantic segmentation model has been trained and can be used for Spot recognition, the circumscribed circle of each Spot is obtained from the X, Y coordinates of the Spot centroid given by the Spot semantic segmentation model.

As described above, the method first applies the image projection algorithm and the template matching algorithm to the inference result to obtain the accurate position and angle of the Block, then performs morphological calculation on the semantic segmentation inference result to obtain the center, radius and category of each Spot, and finally calculates the accurate center and radius of each Spot by using the Otsu threshold algorithm, a traditional image algorithm.

Preferably, the system further comprises an assignment determining unit for calculating the distance between the center coordinates of each Circle in the Block and the X, Y coordinates of the Spot centroid calculated by the Spot semantic segmentation model and, if the distance is within a threshold range, assigning the X, Y coordinates of the Spot centroid calculated by the Spot semantic segmentation model to the X, Y value of the Circle center.

Preferably, the system further comprises a circumscribed circle final determining unit for traversing the threshold range within the local area of each Spot and calculating the Circle center coordinates and radius.
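How a centroid, a radius and an enclosing circle might be derived from a binary mask is sketched below. The choices here are assumptions: regions are found by 4-connected flood fill, the centroid is the mean pixel position, and the radius is the largest distance from the centroid to a region pixel, which encloses the region but is not necessarily the minimum enclosing circle. `label_spots` and `enclosing_circle` are hypothetical names.

```python
import math
from collections import deque


def label_spots(mask):
    """Group foreground pixels into 4-connected regions of (x, y) tuples."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, region = deque([(x0, y0)]), []
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def enclosing_circle(region):
    """Centroid (cx, cy) and the radius reaching the farthest region pixel."""
    cx = sum(x for x, _ in region) / len(region)
    cy = sum(y for _, y in region) / len(region)
    radius = max(math.hypot(x - cx, y - cy) for x, y in region)
    return cx, cy, radius


# 7 x 7 mask with one isolated pixel and one plus-shaped Spot.
mask = [[0] * 7 for _ in range(7)]
mask[0][0] = 1
for x, y in ((3, 3), (2, 3), (4, 3), (3, 2), (3, 4)):
    mask[y][x] = 1

circles = [enclosing_circle(region) for region in label_spots(mask)]
```

In practice the morphology routines of an image library would replace the flood fill, and the radius would be refined on the original 16-bit data as the description states.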

The invention can realize the automation of the whole system from positioning Block, aligning Spot and extracting the image data of the microarray chip.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.

It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention, and not for limiting the same, and although the present invention has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present invention.

Claims

1. A method for identifying an image of a microarray chip, comprising:

2. The method for recognizing an image of a microarray chip according to claim 1, wherein,

Training a predetermined target detection model by using image sets of various array arrangements to generate the Block target detection model, including:

and labeling the target detection label on the JPG image.

3. The method for identifying a microarray chip image according to claim 1, wherein the predetermined target detection model is a YoloV-based target detection model, an initial learning rate parameter of the YoloV-based target detection model is 0.01, and the loss function is a Python-carried nn.

4. The method for identifying the microarray chip image according to claim 1, wherein the step of adjusting the angle of the Block circumscribed rectangle frame based on the projection result of the spot image Block mask map comprises the steps of projecting the spot image Block mask map from a predetermined angle range, calculating information entropy at the same time, determining that the angle corresponding to the minimum information entropy is the offset angle of the Block, and rotating the Block circumscribed rectangle frame by the offset angle under the condition that the offset angle of the Block is larger than a threshold value.

5. The method for recognizing the microarray chip image according to claim 4, wherein the adjusting the angle of the Block circumscribed rectangular frame based on the projection result of the spot image Block mask map comprises locating the final position of the Block circumscribed rectangular frame in a manner that a standard template constructed by using the attribute of the Block is matched with the spot image Block mask map.

6. The method of claim 5, wherein constructing a standard template using the Block attributes comprises constructing a standard template using the Block row and column numbers and the row and column spacing.

7. The method of claim 1, further comprising determining a circumscribing circle for each spot image block based on the centroid coordinates and the radius.

8. A system for identifying images of a microarray chip, the system comprising:

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of identifying images of a microarray chip according to any one of claims 1 to 7 when the program is executed by the processor.

10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the method of identifying a microarray chip image according to any of claims 1 to 7.