Disclosure of Invention
The invention aims to provide a chip surface defect detection method and system based on an improved YOLOv8 model, so that higher detection accuracy and speed can be achieved on industrial chip defect detection equipment, the method and system can be deployed more easily in a defect diagnosis and analysis system, and an inspector can rapidly identify defective chips from the detection results.
The invention provides a chip surface defect detection method based on an improved YOLOv8 model, which comprises the following specific steps:
S1, constructing a chip surface defect picture dataset;
S2, dividing the obtained dataset into a training set, a verification set and a test set;
S3, improving the YOLOv8 model by fusing DCN and C2f to form a new C2f-DCN deformable feature fusion module for feature extraction, introducing the dynamic upsampling operator Dysample, and adopting the MPDIoU boundary loss function, thereby constructing an improved chip surface defect detection model, namely the improved YOLOv8 model;
S4, inputting the chip surface defect dataset into the improved YOLOv8 model for training, and optimizing its detection precision and generalization capability by improving the extraction of chip surface defect features, reducing the loss of detail feature information in the upsampling stage, and enhancing the localization of target defects;
S5, testing the original YOLOv8 model and the improved YOLOv8 model with the test set, comparing the experimental results, and verifying the accuracy and efficiency of the two models in chip surface defect detection, where the test results comprise defect positions, defect types and confidence.
In step S1, a chip surface defect picture dataset is constructed, comprising:
(1.1) obtaining chip surface defect images, namely collecting chip images with defects during chip production and dividing them into three types according to defect type: pin missing, surface scratch and pin bending;
(1.2) image data enhancement, namely expanding the original dataset with image enhancement methods to increase its diversity;
(1.3) image labeling, namely marking each defect area with a bounding box using the LabelImg image annotation tool.
In step S3, an improved chip surface defect detection model is constructed, comprising:
(3.1) configuring the environment, namely establishing a virtual environment with Anaconda and debugging the YOLOv8 model source code;
(3.2) inputting data, namely inputting the constructed chip surface defect dataset into the YOLOv8 network structure model, where the input comprises the chip surface defect images, the chip surface defect image labels, the YOLOv8 network structure configuration file and the YOLOv8 model pre-training weights;
(3.3) configuring hyperparameters, namely setting the network input image size, learning rate, batch size and number of iterations;
(3.4) taking the original YOLOv8 model and improving its network structure.
In step (3.4), the improvement of the YOLOv8 network structure model includes:
(3.4.1) fusing DCN and C2f to form a novel C2f-DCN deformable feature fusion module for feature extraction;
(3.4.2) modifying the original upsampling module to Dysample;
(3.4.3) replacing the original loss function with the MPDIoU boundary loss function.
In step (3.4.1), DCN and C2f are fused to form a new C2f-DCN deformable feature fusion module for feature extraction. DCNv4_Net generates a modulation mask by a depth-wise computation and calculates the sampling offsets by a point-wise operation. The network can thus output, for each pixel, its displacement in the horizontal and vertical directions as well as the shape variation parameters. The parameter weights can be adaptively adjusted to better match the feature locations in the input data, thereby significantly improving the detection performance of the model on irregular defects.
Further, the step of calculating the offset and the shape change parameter of the pixel point in the step (3.4.1) is as follows:
(3.4.1.1) The Bottleneck of the C2f module in YOLOv8 contains two 3×3 convolution kernels that leave the size of the output feature map unchanged. The DCNv4_Net structure also contains a 3×3 convolution kernel and can compute the required offsets and shape-change parameters without changing the feature map size, so the two 3×3 convolutions in the C2f module are replaced with DCNv4_Net to obtain C2f-DCNv4;
(3.4.1.2) inputting features into C2f-DCNv4, where the features first pass through a 1×1 convolution layer and a Split feature segmentation layer that divides them into two parts along the channel dimension: one part waits for the Concat feature splicing, and the other part is fed into the DCNv4 structure. There, the features pass through two DCNv4 convolution layers, each a separable convolution formed by combining a 3×3 convolution with a linear projection: the original convolution weight wk is separated into a depth-wise part, responsible for the location-aware modulation scalar mk, and a point-wise part, the projection weight w shared among sampling points. Passing the features through this separable convolution yields the coordinate offsets and modulation scales of the sampling points:

y(p0) = Σ_{g=1}^{G} Σ_{k=1}^{K} w_g · m_gk · x_g(p0 + p_k + Δp_gk)

In the formula, G is the number of aggregation groups; for the g-th group, wg ∈ R^{C×C′}, C′ = C/G, is the location-independent projection weight; K is the number of sampling points; and m_gk ∈ R is the modulation scalar of the k-th sampling point. Unlike DCNv3, DCNv4 discards the softmax normalization of the modulation scalars, which accelerates the model.
(3.4.1.3) applying a BatchNorm2d normalization operation to the feature output, containing the offsets and modulation scales, obtained from DCNv4_Net:

y = (x − E[x]) / √(Var[x] + ε) · γ + β

In the formula, x is the input feature value, E[x] its mean and Var[x] its variance; these two statistics are computed directly during forward propagation. γ and β adjust the variance and mean, are initialized to 1 and 0 respectively, and are continuously learned and updated during backward propagation of the model. ε guarantees numerical stability by preventing a zero denominator and defaults to 1e-5.
(3.4.1.4) activating the output with the ReLU function, which passes input values greater than 0 unchanged and sets input values less than 0 to 0.
In step (3.4.2), the original upsampling module is replaced with Dysample: a linear projection generates s² offset sets, new sampling points are constructed, and the input feature map is resampled by bilinear interpolation.
Further, the processing of the Dysample upsampling module in step (3.4.2) includes:
First, given an upsampling scale factor s and a feature map X of size C×H×W, bilinear interpolation is performed on the input feature map X. A linear layer with C input channels and 2s² output channels generates an offset O of size 2s²×H×W, which is controlled by a dynamic range factor. The offset O is then reshaped to 2×sH×sW by pixel shuffling and added to the original sampling grid G to output a sampling set S. Finally, a grid sampling function resamples the feature map X at the positions of the sampling set S, generating the upsampled feature map X′ of size C×sH×sW.
In step (3.4.3), the original loss function is replaced with the MPDIoU boundary loss function. The MPDIoU loss function jointly considers the intersection-over-union, the corner-point distances and the width-height deviation of the prediction box and the ground-truth box. Taking the top-left and bottom-right corner coordinates of the boxes as input, it minimizes the distances between these two corner points of the prediction box and of the ground-truth box, which accelerates the regression convergence of the network toward the prediction box and thus yields more accurate predictions. The MPDIoU loss is defined by the following formula:

MPDIoU = IoU − d1²/(w² + h²) − d2²/(w² + h²)
d1² = (x1^pre − x1^gt)² + (y1^pre − y1^gt)²
d2² = (x2^pre − x2^gt)² + (y2^pre − y2^gt)²
L_MPDIoU = 1 − MPDIoU

where B^pre denotes the coordinates of the prediction bounding box, B^gt the coordinates of the ground-truth bounding box, (x1, y1) the top-left corner coordinates of a box, (x2, y2) its bottom-right corner coordinates, d1 and d2 the distances between the top-left and the bottom-right corners of the prediction bounding box and the ground-truth bounding box respectively, and w, h the width and height of the chip picture.
In another aspect, the present invention further provides a chip surface defect detection system based on the improved YOLOv8 model, which comprises the following modules:
(1) The data set establishing module is used for establishing a chip surface defect picture data set;
(2) The data set dividing module is used for dividing the obtained data set into a training set, a verification set and a test set;
(3) The model building module is used for building the improved YOLOv8 network model;
(4) The training and prediction module is used for inputting the dataset into the improved YOLOv8 network model for training and prediction, obtaining chip images containing surface defects.
The invention also provides computer equipment comprising a camera, an internal memory, an external memory and a processor, wherein the camera acquires chip images with surface defects, the internal memory provides an operating environment for the operating system and application programs, the external memory stores the operating system and the computer programs, and the processor runs the computer programs implementing steps S1-S5.
Compared with the prior art, the method has the following notable advantages: the improved YOLOv8 model can identify defects of different shapes and sizes through DCNv4 deformable convolution, enhancing the feature extraction capability of the network; the Dysample upsampling operator reduces the loss of defect edges and detail information, retaining more feature information; and replacing the original loss function with the MPDIoU boundary loss function accelerates the convergence of network training and improves the model's ability to localize defects. Compared with the original model, the detection performance is improved, and the method is better suited to chip surface defect detection tasks.
Detailed Description
So that the above-recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention is given below with reference to the appended drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The invention may, however, be embodied in many different forms, and modifications may be made by one skilled in the art without departing from its spirit or essential characteristics; it is therefore not limited to the specific embodiments disclosed herein.
As shown in fig. 1, the chip surface defect detection method based on the improved YOLOv8 model according to the embodiment of the present invention includes:
step 1, constructing a chip surface defect picture data set, which comprises the following specific steps:
(1.1) acquiring chip surface defect images. The chip surface defect images used in the invention were photographed during the chip production process of a chip enterprise. The dataset contains 9880 images in total, with a resolution of 1280 × 1080 in RGB format, covering three types of defects: pin missing, surface scratches and pin bending.
(1.2) image data enhancement. To enhance the robustness and generalization ability of the model, image flipping, random scaling and Mosaic data enhancement techniques are applied to expand the original dataset and enrich its diversity. These data enhancement techniques help the model learn a broader feature representation, reduce overfitting, and improve the model's ability to detect chip surface defects in different scenes, thereby improving industrial production efficiency and quality.
(1.3) image labeling. The images are annotated with the LabelImg image annotation tool, drawing a bounding box around each actual defect area; images containing several defects are marked with multiple anchor boxes. The missing-pin label is 0 (queshi), the surface-scratch label is 1 (huaheng), and the bent-pin label is 2 (wanqu). After annotation, the generated json-format data is saved and converted into the txt format that the YOLO network can process.
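As an illustrative sketch of the json-to-txt conversion in step (1.3) (not part of the claimed invention): the json field names ("shapes", "points", "imageWidth", "imageHeight") and the rectangle representation below are assumptions about the annotation export format, and the class map follows the labels given above.

```python
# Hypothetical annotation layout; adapt field names to the real export format.
CLASS_MAP = {"queshi": 0, "huaheng": 1, "wanqu": 2}  # missing pin / scratch / bent pin

def box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert absolute corner coordinates to normalized YOLO (cx, cy, w, h)."""
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h

def annotation_to_txt(ann):
    """Turn one annotation dict into YOLO txt lines: 'cls cx cy w h'."""
    img_w, img_h = ann["imageWidth"], ann["imageHeight"]
    lines = []
    for shape in ann["shapes"]:
        (x1, y1), (x2, y2) = shape["points"]  # rectangle: two opposite corners
        cls = CLASS_MAP[shape["label"]]
        cx, cy, w, h = box_to_yolo(min(x1, x2), min(y1, y2),
                                   max(x1, x2), max(y1, y2), img_w, img_h)
        lines.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each output line matches the one-box-per-line YOLO label convention, with coordinates normalized to the image size.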
Step 2, dataset division: the sample-expanded and data-enhanced chip surface defect images and their corresponding label files are divided into a training set, a test set and a verification set in the ratio 8:1:1.
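The 8:1:1 split can be sketched as follows; the fixed seed is an illustrative choice for reproducibility, not something specified above.

```python
import random

def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=42):
    """Shuffle and split a list of (image, label) items into train/test/val
    according to the 8:1:1 ratio described above."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(n * ratios[0])
    n_test = int(n * ratios[1])
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    val = items[n_train + n_test:]
    return train, test, val
```

In practice the same split is applied to image files and their label files together so each image keeps its annotation.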
Step 3, constructing the chip surface defect detection model based on the improved YOLOv8 model.
(3.1) environment configuration: download the official YOLOv8 source code from the Ultralytics GitHub repository, build a virtual environment with Anaconda, and install the following libraries according to the requirements file in the YOLOv8 source code: matplotlib>=3.3.0, numpy>=1.22.2, opencv-python>=4.6.0, pillow>=7.1.2, pyyaml>=5.3.1, requests>=2.23.0, scipy>=1.4.1, tqdm>=4.64.0, torch>=1.8.0, torchvision>=0.9.0.
(3.2) data input: the divided chip surface defect training set and test set are input into the YOLOv8 network model; the data comprise the chip surface defect images, the corresponding labels, the YOLOv8 network model pre-training weights and the YOLOv8 network structure configuration file. The network structure of YOLOv8 is shown in fig. 2.
(3.3) hyperparameter settings: the learning rate, number of batch samples, number of iterations, number of image channels, image input size and optimizer are set as follows: initial learning rate 0.01, minimum learning rate 0.001, weight decay 0.0005, batch size 16, 300 iterations, 3 image channels, image input size 640 × 640, optimizer AdamW.
(3.4) model improvement: the original model adopted by the invention is YOLOv8, and its network structure is improved in the following three ways.
(3.4.1) feature extraction is performed by fusing DCN and C2f to form a new C2f-DCN deformable feature fusion module:
(3.4.1.1) introducing DCNv4 deformable convolution into the Bottleneck of the C2f module to replace the original plain convolution, forming C2f-DCNv4;
(3.4.1.2) fig. 3 shows the structure of the C2f-DCN module formed by fusing DCNv4 and C2f. The input features first pass through a 1×1 convolution layer, after which a Split feature segmentation layer divides them into two parts along the channel dimension: one part waits for the Concat feature splicing, and the other part enters the DCNv4 structure. There, the features pass through a 3×3 convolution layer and a DCNv4 convolution layer, the latter being a separable convolution formed by combining a 3×3 convolution with a linear projection: the original convolution weight wk is separated into a depth-wise part, responsible for the location-aware modulation scalar mk, and a point-wise part, the projection weight w shared among sampling points. The output feature at pixel p0 is obtained through this separable convolution as:

y(p0) = Σ_{g=1}^{G} Σ_{k=1}^{K} w_g · m_gk · x_g(p0 + p_k + Δp_gk)

In the formula, G is the number of aggregation groups; for the g-th group, wg ∈ R^{C×C′}, C′ = C/G, is the location-independent projection weight; K is the number of sampling points; and m_gk ∈ R is the modulation scalar of the k-th sampling point. Unlike DCNv3, DCNv4 removes the final softmax normalization, cutting unnecessary computation and accelerating the model;
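The aggregation formula above can be illustrated with a minimal single-group (G = 1), single-channel sketch: each sampling point of the kernel grid is displaced by a learned offset, sampled with bilinear interpolation, weighted by its modulation scalar (no softmax, mirroring DCNv4), and summed. This is a didactic sketch, not the optimized DCNv4 operator.

```python
import numpy as np

def bilinear(x, py, px):
    """Bilinearly sample a 2-D map x at fractional location (py, px), zero-padded."""
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < h and 0 <= xx < w:
                val += (1 - abs(py - yy)) * (1 - abs(px - xx)) * x[yy, xx]
    return val

def deform_sample(x, p0, grid, offsets, m, w_g=1.0):
    """Aggregate K deformable sampling points at reference location p0:
    y(p0) = w_g * sum_k m_k * x(p0 + p_k + dp_k), with a single group G = 1.
    The modulation scalars m_k are applied directly, without softmax."""
    y0, x0 = p0
    out = 0.0
    for (ky, kx), (oy, ox), mk in zip(grid, offsets, m):
        out += mk * bilinear(x, y0 + ky + oy, x0 + kx + ox)
    return w_g * out
```

With zero offsets and unit modulation the operator reduces to an ordinary (rigid) convolution sample; the learned offsets are what let the kernel follow irregular defect shapes.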
(3.4.1.3) applying a BatchNorm2d normalization operation to the feature output from DCNv4:

y = (x − E[x]) / √(Var[x] + ε) · γ + β

In the formula, x denotes the input feature value, E[x] its mean and Var[x] its variance; these two statistics are computed directly during forward propagation. γ and β adjust the variance and mean, are initialized to 1 and 0 respectively, and are continuously learned and updated during the backward propagation of training. ε guarantees numerical stability by preventing a zero denominator and defaults to 1e-5.
(3.4.2) Modifying the original upsampling Module to Dysample
As shown in fig. 4, to improve the ability of the chip surface defect detection model to capture target defect features and details, the invention uses the efficient Dysample upsampling, which generates s² offset sets through a linear projection, constructs new sampling points, and resamples the input feature map by bilinear interpolation, thereby reducing the inference cost and improving the reliability of the upsampling process.
The processing procedure of the Dysample up-sampling module comprises:
In the grid sampling process, a feature map X of size C×H1×W1 and a sampling set S of size 2×H2×W2 are first given, where the two channels of S denote the x and y coordinates. Assuming bilinear interpolation, the grid sampling function resamples X to X′ of size C×H2×W2 using the positions in the sampling set S, where the sampling function is defined as follows:
X′=grid_sample(X,S),
In a specific implementation, the upsampling scale factor s and the feature map X of size C×H×W are given first. A linear layer with C input channels and 2gs² output channels generates an offset O of size 2gs²×H×W, controlled by a dynamic range factor. The offset O is then reshaped to 2×sH×sW by pixel shuffling and added to the original sampling grid G to output a sampling set S, and the grid sampling function resamples the feature map X at the positions of the sampling set S, generating the upsampled feature map X′ of size C×sH×sW.
Dysample upsampling generates the dynamic range factor point by point through a linear projection of the input features; by introducing a sigmoid function and a static factor of 0.5, the dynamic range is confined to [0, 0.5], further increasing the flexibility of the offset, which is computed by the following formula:
O=0.5sigmoid(linear1(X))*linear2(X)
In the present invention, instead of reconstructing the input features with content-aware upsampling kernels, Dysample performs upsampling by generating sampling point locations: each input position is expanded into s² sampling points, so the whole upsampling process reduces to point sampling. In addition, the kernel of Dysample upsampling is conditioned only on the x and y positions and requires only a 2-channel feature map, making the sampling more efficient.
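The Dysample procedure above (offset + base grid, then grid resampling by bilinear interpolation) can be sketched for a single channel as follows. This is a simplified illustration: the offsets are taken as a given array rather than produced by the linear layers, and border clamping stands in for grid_sample's padding handling.

```python
import numpy as np

def sigmoid(z):
    """Used by Dysample to bound the dynamic range factor: O = 0.5*sigmoid(l1)*l2."""
    return 1.0 / (1.0 + np.exp(-z))

def dysample_like(x, s, offset):
    """Offset-based upsampling of a (H, W) map by integer factor s.

    offset has shape (2, s*H, s*W) holding per-output-pixel (dy, dx)
    displacements (in Dysample these come from linear projections of X).
    Each output pixel resamples x at base-grid position + offset,
    mirroring X' = grid_sample(X, S)."""
    h, w = x.shape
    sh, sw = s * h, s * w
    out = np.zeros((sh, sw))
    for i in range(sh):
        for j in range(sw):
            # base sampling grid plus learned offset
            py = min(max(i / s + offset[0, i, j], 0.0), h - 1.0)
            px = min(max(j / s + offset[1, i, j], 0.0), w - 1.0)
            # bilinear interpolation between the four neighbours
            y0, x0 = int(py), int(px)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = py - y0, px - x0
            out[i, j] = ((1 - wy) * (1 - wx) * x[y0, x0]
                         + (1 - wy) * wx * x[y0, x1]
                         + wy * (1 - wx) * x[y1, x0]
                         + wy * wx * x[y1, x1])
    return out
```

With zero offsets this degenerates to plain bilinear-style resampling; the learned offsets are what allow the operator to preserve defect edges and fine details.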
(3.4.3) replacing the original loss function with the MPDIoU boundary loss function
The MPDIoU loss function jointly considers the intersection-over-union, the corner-point distances and the width-height deviation of the prediction box and the ground-truth box. Taking the top-left and bottom-right corner coordinates of the boxes as input, it minimizes the distances between these two corner points of the prediction box and of the ground-truth box, which accelerates the regression convergence of the network toward the prediction box and thus yields more accurate predictions. The MPDIoU loss is defined by the following formula:

MPDIoU = IoU − d1²/(w² + h²) − d2²/(w² + h²)
d1² = (x1^pre − x1^gt)² + (y1^pre − y1^gt)²
d2² = (x2^pre − x2^gt)² + (y2^pre − y2^gt)²
L_MPDIoU = 1 − MPDIoU

where B^pre denotes the coordinates of the prediction bounding box, B^gt the coordinates of the ground-truth bounding box, (x1, y1) the top-left corner coordinates of a box, (x2, y2) its bottom-right corner coordinates, d1 and d2 the distances between the top-left and the bottom-right corners of the prediction bounding box and the ground-truth bounding box respectively, and w, h the width and height of the chip picture.
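The MPDIoU formula above can be sketched directly; boxes are given as (x1, y1, x2, y2) with top-left and bottom-right corners, and eps is a small constant to avoid division by zero.

```python
def mpdiou(box_pred, box_gt, img_w, img_h, eps=1e-7):
    """MPDIoU between two (x1, y1, x2, y2) boxes, per the formula above:
    MPDIoU = IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2), where d1/d2 are the
    top-left / bottom-right corner distances. The bounding-box loss
    would then be L = 1 - MPDIoU."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    # intersection over union
    ix1, iy1 = max(px1, gx1), max(py1, gy1)
    ix2, iy2 = min(px2, gx2), min(py2, gy2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((px2 - px1) * (py2 - py1)
             + (gx2 - gx1) * (gy2 - gy1) - inter + eps)
    iou = inter / union
    # squared corner distances, normalized by the squared image diagonal
    d1 = (px1 - gx1) ** 2 + (py1 - gy1) ** 2
    d2 = (px2 - gx2) ** 2 + (py2 - gy2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1 / norm - d2 / norm
```

A perfect prediction gives MPDIoU = 1 (loss 0); any corner displacement lowers the value, which is what drives the corner points of the prediction box toward those of the ground-truth box.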
Step 4, training with the improved YOLOv8 model, comprising:
(4.1) the hardware environment used by the invention: CPU Intel Core i9-12900K with a 3.2 GHz base frequency, 64 GB memory, GPU NVIDIA GeForce RTX 3080 with 24 GB GDDR6X video memory; the software environment: the deep-learning framework PyTorch 1.12.1, Python 3.8, CUDA 11.3. The hyperparameters are set as follows: initial learning rate 0.01, minimum learning rate 0.001, weight decay 0.0005, batch size 16, 300 iterations, 3 image channels, image input size 640 × 640, optimizer AdamW.
(4.2) during model training, YOLOv8 pre-training weights are used to speed up network training; Ultralytics officially provides several pre-training weights, and different versions can be selected according to different requirements. In the present invention, the selected pre-training weight is yolov8s.pt. The model is validated on the verification set every 50 iterations. The learning rate is then adjusted dynamically according to the validation results to prevent overfitting and underfitting, being gradually reduced with a learning rate decay strategy. Finally, training proceeds until the preset number of iterations is reached or the performance on the verification set stops improving, completing the training of the network model.
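The learning rate decay described above can be sketched as follows. The linear shape from the initial rate 0.01 down to the minimum rate 0.001 over 300 iterations is an illustrative assumption; other schedules (e.g. cosine decay) fit the same description.

```python
def decayed_lr(epoch, total_epochs=300, lr0=0.01, lr_min=0.001):
    """Linearly decay the learning rate from lr0 to lr_min over training,
    matching the initial/minimum learning rates stated above."""
    t = min(max(epoch / max(total_epochs - 1, 1), 0.0), 1.0)  # progress in [0, 1]
    return lr0 + (lr_min - lr0) * t
```

The optimizer's learning rate would be set to `decayed_lr(epoch)` at the start of each epoch.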
Step 5, testing and evaluating the improved YOLOv8 model.
Model testing and evaluation: the weight file generated after model training is loaded into the test script for model testing, obtaining the test results of the improved chip surface defect detection model on the test set, including defect positions, defect types and confidence, which are compared with the results of the original YOLOv8 model.
The improved chip surface defect detection model is built on the YOLOv8 model: DCN and C2f are fused into a novel C2f-DCN deformable feature fusion module for feature extraction, the original upsampling module is replaced with Dysample, and the original loss function is replaced with the MPDIoU boundary loss function.
The improved algorithm not only performs better in overall detection precision, but also markedly improves the recognition of fine defects against complex backgrounds; the introduction of the DCN operator strengthens the detection of irregular target defects, and MPDIoU strengthens the localization of target defects.
The embodiment of the invention also provides a chip surface defect detection system based on the improved YOLOv8 model, which comprises the following modules:
(1) The data set establishing module is used for establishing a chip surface defect picture data set;
(2) The data set dividing module is used for dividing the obtained data set into a training set, a verification set and a test set;
(3) The model building module is used for building the improved YOLOv8 network model;
(4) The training and prediction module is used for inputting the dataset into the improved YOLOv8 network model for training and prediction, obtaining chip images containing surface defects.
In this embodiment, a computer device is provided, which may be a terminal; its internal structure may be as shown in fig. 5. The computer device includes a processor, an internal memory, an external memory and a camera connected by a system bus. The processor provides computing and control capability, the internal memory provides an operating environment for the operating system and application programs, and the external memory is a nonvolatile storage medium storing the operating system and the computer programs. The camera of the computer device collects chip images with surface defects, and the processed results are shown on a display screen. The communication interface of the computer device can communicate with external terminals via WiFi, and the display screen may be a liquid crystal display.
It should be noted that the architecture depicted in fig. 5 merely represents some examples of hardware architectures related to the subject matter of this application and is not intended to limit the type or configuration of computer device to which the invention may be applied. In practice, the above components may be added, removed or combined in different ways, or even laid out differently, according to different requirements. Many variations will thus be apparent to those skilled in the art, and the invention is not limited to the specific examples described herein.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention in any way. Modifications and variations of the disclosed embodiments are possible in light of the above teachings, as will be recognized by those skilled in the art. However, any simple modification, equivalent variation and adaptation of the above embodiments made according to the technical substance of the present invention, without departing from that technical substance, still falls within the scope of the technical solution of the present invention.