Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the scope of the present application.
In various fields, before a user performs an action such as a transaction, an investment, or a rating, it is necessary to verify the identity, rights, capabilities, etc. of the user by having the user upload an image containing a target object, typically a certificate. It should be understood that the user may be a person, a group, or an enterprise, and that the certificate uploaded by the user may be an identification card, a business license, a qualification certificate, etc.
At present, for an image uploaded by a user, the certificate in the image is recognized to obtain content such as the text in the certificate. However, the quality of the image uploaded by the user often fails to meet the recognition requirements; for example, the image may have low sharpness or low brightness, may be overexposed, or the certificate in it may be incomplete. If an image that does not meet the recognition requirements is recognized directly, the accuracy of the recognition result cannot be ensured.
To address these problems, the embodiments of the present application acquire image features of an image to be recognized, determine, based on the image features, whether the image satisfies recognition conditions in terms of imaging quality or the integrity of the target object, and recognize the text in the image only after determining that the image satisfies the recognition conditions, so as to obtain text information.
By way of example and not limitation, it may also be confirmed whether the image to be recognized is a recapture, a copy, or an invalid object (e.g., an incorrect document or the absence of the desired document), and whether its orientation is correct; the text in the image is recognized only when the state of the image is confirmed to be normal and the image satisfies the recognition condition.
The technical solution of the embodiments of the present application can be applied to various electronic devices and is used to achieve accurate recognition of the image to be recognized. The electronic device may be a terminal device, such as a mobile phone, a tablet (Pad), a computer, a Virtual Reality (VR) terminal device, an Augmented Reality (AR) terminal device, a terminal device in industrial control, a terminal device in self-driving, a terminal device in telemedicine, a terminal device in a smart city, a terminal device in a smart home, or the like. The terminal device in the embodiments of the present application may also be a wearable device. A wearable device, also called a wearable intelligent device, is a generic term for devices designed and developed by applying wearable technology, such as glasses, gloves, watches, clothes, and shoes; it is a portable device that is worn directly on the body or integrated into the clothing or accessories of the user. The terminal device may be fixed or mobile.
The electronic device in the embodiments of the present application may also be a server. When the electronic device is a server, it can receive an image collected by a terminal device and process the image to achieve accurate recognition of the target object.
Fig. 1 is a schematic structural diagram of an electronic device 100 according to an embodiment of the present application. As shown in fig. 1, the electronic device 100 includes: an image acquisition unit 101, an image determination unit 102, and an image recognition unit 103, the image determination unit 102 being connected to the image acquisition unit 101 and the image recognition unit 103, respectively.
The image acquisition unit 101 is configured to acquire an image to be recognized, which should contain, for example, a certificate-type object. By way of example, the image may be received from an image capturing device, received from another device, or input by a user, which is not limited in the embodiments of the present application.
The image determination unit 102 receives the image to be recognized sent by the image acquisition unit 101, confirms the imaging quality of the image and/or the integrity of the target object in the image, and sends the image to the image recognition unit 103 when the image satisfies the recognition condition, that is, when the imaging quality of the image satisfies the quality condition and/or the integrity of the target object satisfies the integrity condition. Conversely, if the imaging quality of the image does not satisfy the recognition condition, or the integrity of the target object does not satisfy the recognition condition, or neither satisfies the recognition condition, the image is not recognized.
In some embodiments, the electronic device 100 further comprises an information sending unit 104 connected to the image determination unit 102. When the image determination unit 102 determines that the image does not satisfy the recognition condition, it sends an instruction to the information sending unit 104, and the information sending unit 104 generates indication information according to the instruction. The indication information indicates that the image does not meet the requirements; optionally, it may further indicate the items that fail the requirements and prompt the user to provide a new image to be recognized.
After receiving the image to be recognized, the image recognition unit 103 recognizes the target object in the image to obtain information of the target object. Generally, the obtained information of the target object is text information; in some embodiments, image information of the target object may also be obtained, for example, when the target object is an identity card, the personal photo in the identity card.
It should be appreciated that the electronic device 100 further comprises a storage unit (not shown in the figure) for storing the information of the recognized target objects. For example, the information of each target object is stored in a structured manner; for instance, the entry "name: Li Mou" in an identity card is stored as [name, Li Mou].
The application is illustrated in the following by means of several examples.
Fig. 2 is a flowchart of an image processing method 200 according to an embodiment of the present application.
In order to accurately identify a target object in an image, before the image is identified, the embodiment of the application firstly confirms the imaging quality and the integrity of the image, and when the image is confirmed to meet the identification condition based on the imaging quality and the integrity, the target object in the image is identified.
As shown in fig. 2, the image processing method provided by the embodiment of the application includes:
S201: At least one image feature of an image to be recognized is acquired.
It should be appreciated that the image to be recognized includes a target object, which may optionally be a certificate-type object, such as an identity card, a business license, or a qualification certificate, or may be a document-type object.
Wherein the image features are used to characterize the imaging quality of the image or the integrity of the target object in the image.
The image features may comprise at least one first feature for characterizing the imaging quality of the image and/or a second feature for characterizing the integrity of the target object. Correspondingly, acquiring at least one image feature of the image to be identified comprises: at least one first feature of the image is acquired and/or a second feature of the image is acquired. It will be appreciated that only at least one first feature of the image or only a second feature of the image may be acquired, or that at least one first feature and second feature of the image may be acquired separately.
Optionally, the first feature may be any one of a variance feature, a mean feature, a first pixel number feature, or a second pixel number feature. The first pixel number is the number of adjacent pixels whose pixel values are larger than a first preset pixel value, and the second pixel number is the number of adjacent pixels whose pixel values are smaller than a second preset pixel value.
S202: based on the at least one image feature, it is determined whether the image satisfies the recognition condition.
In this step, it is determined whether the image satisfies the recognition condition in combination with the acquired at least one image feature. For example, based on the variance feature, it is determined whether the sharpness of the image meets the requirement, and the image is determined to satisfy the recognition condition when the sharpness meets the requirement. For another example, based on the variance feature, it is determined whether the sharpness of the image meets the requirement, and based on the mean feature, it is determined whether the brightness of the image meets the requirement; when both the sharpness and the brightness meet the requirements, the image is determined to satisfy the recognition condition.
S203: and when the image meets the identification condition, identifying the target object in the image to obtain the information of the target object.
If the image satisfies the recognition condition, the image can be recognized accurately; that is, recognizing an image that satisfies the recognition condition yields a more accurate recognition result. The information of the target object is obtained by recognizing the target object in the image. Generally, the information of the target object includes text information, such as the name, address, and date of birth in an identity card; in some embodiments, it further includes image information, such as the personal photo in an identity card.
Illustratively, the information of the target object is stored for subsequent querying or use.
The embodiments of the present application determine, based on at least one image feature of the acquired image to be recognized, whether the image satisfies the recognition condition in terms of imaging quality and/or the integrity of the target object, and recognize the target object in the image when the image satisfies the recognition condition, so as to obtain an accurate recognition result.
Fig. 3 is a flowchart of an image processing method 300 according to an embodiment of the present application.
In order to accurately confirm whether the image satisfies the recognition condition, the embodiment of the present application proposes the implementation shown in fig. 3 for determining whether the image satisfies the recognition condition, including:
S301: For each image feature of the at least one image feature, obtain, based on the image feature and a corresponding threshold, an evaluation result of whether the image feature satisfies a corresponding preset condition.
For example, if the image feature is the variance of the image, the image feature is determined to satisfy the corresponding preset condition when the variance is greater than a sharpness threshold. It should be understood that the variance of the image is calculated based on the pixel value of each pixel in the image. The greater the variance, the wider the frequency response range of the image, indicating that the image is focused accurately, i.e., the sharpness of the image is high; the smaller the variance, the narrower the frequency response range, indicating that the image has few edges, i.e., the sharpness of the image is low. The sharpness threshold is a preset variance value that meets the sharpness requirement.
If the image feature is the mean of the image, the image feature is determined to satisfy the corresponding preset condition when the mean is greater than a brightness threshold. The mean of the image is calculated based on the pixel value of each pixel in the image: the larger the mean, the higher the brightness of the image; the smaller the mean, the lower the brightness. The brightness threshold is a preset mean value that meets the brightness requirement.
If the image feature is the first pixel number, the image feature is determined to satisfy the corresponding preset condition when the first pixel number is smaller than a first number threshold, where the first pixel number is the number of adjacent pixels whose pixel values are larger than the first preset pixel value. First, the number of adjacent pixels in the image whose values are larger than the first preset pixel value, i.e., the first pixel number, is determined; when the first pixel number is smaller than the first number threshold, it indicates that there is no bright spot, or so-called flare, in the image.
If the image feature is the second pixel number, the image feature is determined to satisfy the corresponding preset condition when the second pixel number is smaller than a second number threshold, where the second pixel number is the number of adjacent pixels whose pixel values are smaller than the second preset pixel value. First, the number of adjacent pixels in the image whose values are smaller than the second preset pixel value, i.e., the second pixel number, is determined; when the second pixel number is smaller than the second number threshold, it indicates that there is no dark region, or so-called shadow, in the image.
It should be noted that the first preset pixel value is greater than the second preset pixel value; the first number threshold and the second number threshold may be the same or different, and the present application is not limited in this regard.
If the image feature is the intersection ratio, the image feature is determined to satisfy the corresponding preset condition when the intersection ratio is larger than an intersection-ratio threshold, where the intersection ratio is the ratio of the intersection to the union of a foreground image, obtained by segmenting the image, and a preset image; the foreground image contains the target object while the background image does not, and generally the foreground image contains only the target object. It should be understood that when the intersection ratio is 1, the target object in the image is complete; when the intersection ratio is less than 1, the target object has missing edges, missing corners, or occlusion, and the smaller the intersection ratio, the more severe these defects are. The intersection-ratio threshold is used to decide, according to the acceptable degree of incompleteness of the target object, whether the preset condition is satisfied.
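By way of illustration only, the per-feature evaluation of S301 may be sketched in Python as follows; every threshold name and numeric value below is a hypothetical placeholder chosen for the sketch, not a value specified by the embodiments.

# Hedged sketch of S301: each image feature is compared with its own
# threshold. All names and numeric values are hypothetical placeholders.
THRESHOLDS = {
    "variance": ("gt", 100.0),     # sharpness: variance must exceed threshold
    "mean": ("gt", 60.0),          # brightness: mean must exceed threshold
    "bright_count": ("lt", 500),   # first pixel number must stay below threshold
    "dark_count": ("lt", 500),     # second pixel number must stay below threshold
    "iou": ("gt", 0.9),            # intersection ratio must exceed threshold
}

def evaluate_feature(name: str, value: float) -> bool:
    """Return the evaluation result for one image feature."""
    direction, threshold = THRESHOLDS[name]
    return value > threshold if direction == "gt" else value < threshold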
S302: based on the evaluation result of each image feature, it is determined whether the image satisfies the recognition condition.
In a practical application scenario, whether the image satisfies the recognition condition may be determined by weighting the evaluation results of the image features; or the image may be determined to satisfy the recognition condition when more than half of the evaluation results indicate that the corresponding image features satisfy their preset conditions, and not to satisfy it otherwise; or the image may be determined to satisfy the recognition condition only when every evaluation result indicates that the corresponding image feature satisfies its preset condition, and not to satisfy it when any evaluation result indicates otherwise.
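The three combination strategies described above may be sketched as follows; the weights and the cutoff are likewise illustrative assumptions.

# Hedged sketch of S302: three ways to combine per-feature evaluation results.
from typing import Dict

def all_pass(results: Dict[str, bool]) -> bool:
    """Strictest strategy: every feature must satisfy its preset condition."""
    return all(results.values())

def majority(results: Dict[str, bool]) -> bool:
    """More than half of the evaluation results must be positive."""
    return sum(results.values()) * 2 > len(results)

def weighted(results: Dict[str, bool],
             weights: Dict[str, float], cutoff: float = 0.8) -> bool:
    """Weighted combination; `weights` and `cutoff` are hypothetical."""
    score = sum(weights[name] for name, ok in results.items() if ok)
    return score / sum(weights.values()) >= cutoff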
Fig. 4 is a flowchart of an image processing method 400 according to an embodiment of the present application.
On the basis of any of the above embodiments, an embodiment of the present application will be described with reference to fig. 4, which illustrates how to acquire at least one first feature of an image.
As shown in fig. 4, the method includes:
S401: The image is converted into a gray-scale image.
Generally, the image to be recognized is a color image, such as an RGB image; in this step, the color image is converted into a gray-scale image by color space conversion. Optionally, the pixel value of each pixel in the gray-scale image is between 0 and 255.
S402: at least one first feature of the image is determined based on the gray scale image.
At least one first feature of the image, such as the variance, the mean, the first pixel number, or the second pixel number, is derived based on the pixel value of each pixel in the gray-scale image.
In connection with fig. 5, a possible implementation is provided for determining at least one first feature of the image based on the gray scale image.
S501: The gray-scale image is converted into a Laplacian image by a Laplacian operator.
It should be appreciated that the Laplacian is a differential operator whose application enhances regions of abrupt gray-scale change in the gray-scale image and weakens regions of slow gray-scale change.
In this step, the gray-scale image is converted into a Laplacian image; the operation may be performed with any Laplacian operator.
Illustratively, a convolution operation is performed on the gray-scale image with a preset Laplacian mask to obtain the Laplacian image.
The Laplacian mask is a predetermined convolution template; preferably, the Laplacian mask may be set to the 3 by 3 mask shown in Table 1.
TABLE 1
S502: at least one first feature of the image is determined based on the laplace image.
Illustratively, at least one of the variance, the mean, the first pixel number, or the second pixel number of the Laplacian image is calculated based on the pixel values of the pixels in the Laplacian image.
The first pixel number is the number of adjacent pixels whose pixel values are larger than the first preset pixel value, and the second pixel number is the number of adjacent pixels whose pixel values are smaller than the second preset pixel value, where the first preset pixel value is larger than the second preset pixel value.
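A possible OpenCV sketch of S401 and S501-S502 is given below. Since the contents of Table 1 are not reproduced here, a standard 4-neighbour Laplacian kernel is assumed, and "adjacent pixels" is read as thresholded pixels having at least one 4-connected neighbour that also passes the threshold; both choices are assumptions made only for illustration.

# Hedged sketch of S401/S501/S502 with OpenCV; the kernel and the reading of
# "adjacent pixels" are assumptions, not taken verbatim from the embodiments.
import cv2
import numpy as np

LAPLACIAN_MASK = np.array([[0,  1, 0],
                           [1, -4, 1],
                           [0,  1, 0]], dtype=np.float32)  # assumed 3x3 mask

def count_adjacent(passing: np.ndarray) -> int:
    """Count passing pixels that have at least one 4-connected passing
    neighbour -- one possible reading of 'adjacent pixels'."""
    m = passing.astype(np.uint8)
    neighbour_kernel = np.array([[0, 1, 0],
                                 [1, 0, 1],
                                 [0, 1, 0]], dtype=np.uint8)
    neighbours = cv2.filter2D(m, -1, neighbour_kernel)
    return int(np.count_nonzero(m & (neighbours > 0)))

def first_features(image_bgr: np.ndarray,
                   first_preset: float = 200.0,
                   second_preset: float = 50.0) -> dict:
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)            # S401
    lap = np.abs(cv2.filter2D(gray.astype(np.float32),
                              cv2.CV_32F, LAPLACIAN_MASK))        # S501
    return {                                                      # S502
        "variance": float(lap.var()),
        "mean": float(lap.mean()),
        "bright_count": count_adjacent(lap > first_preset),
        "dark_count": count_adjacent(lap < second_preset),
    }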
Fig. 6 is a flowchart of an image processing method 600 according to an embodiment of the present application.
On the basis of any of the above embodiments, an embodiment of the present application will be described with reference to fig. 6, which illustrates how to acquire the second feature of the image.
As shown in fig. 6, the method includes:
S601: Image segmentation is performed on the image by a segmentation algorithm to obtain a foreground image and a background image.
In this step, the image is segmented by a segmentation algorithm; for example, the GrabCut segmentation algorithm is used to segment the image into a foreground image containing the target object and a background image not containing the target object.
S602: based on the foreground image and the preset image, an Intersection-over-Union (IoU) of the foreground image and the preset image is calculated.
The preset image is an image having the same aspect ratio as the target object. As an example, when the user captures an image of the target object, the preset image or its outline is displayed in the viewfinder or preview frame, so that the user can capture an image containing the target object aligned with the preset image. As another example, after the image to be recognized is acquired, the target object in the image is calibrated against the preset image, for example by aligning the center point of the target object with that of the preset image and scaling the target object to the size of the preset image.
In this step, the intersection of the foreground image and the preset image, i.e., the area of the quadrilateral ABCD in fig. 7a, is divided by their union, i.e., the area of the irregular polygon EFBGHD in fig. 7a, to obtain the intersection ratio. The intersection ratio characterizes the ratio of the intersection to the union of the foreground image and the preset image. For example, as shown in fig. 7b, the quadrilateral A'B'C'D' is an occluded area belonging to the background image; in that case the intersection ratio is the area of the quadrilateral ABCD minus the area of the quadrilateral A'B'C'D', divided by the area of the irregular polygon EFBGHD.
It should be appreciated that the second feature includes the cross-over ratio.
Further, when the intersection ratio is larger than the intersection ratio threshold value, determining that the image feature meets a corresponding preset condition.
In this embodiment, whether the target object in the image is complete is determined by calculating the intersection ratio, so that an image in which the target object has missing edges, missing corners, or occlusion is not mistakenly determined to satisfy the recognition condition.
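A possible sketch of S601-S602 using OpenCV's GrabCut is given below; the initial rectangle and the supply of the preset image region as a binary mask are illustrative assumptions.

# Hedged sketch of S601/S602: GrabCut segmentation followed by an IoU
# computation; `rect` and `preset_mask` are assumed inputs for illustration.
import cv2
import numpy as np

def foreground_mask(image_bgr: np.ndarray, rect) -> np.ndarray:
    """S601: segment the image with GrabCut; `rect` = (x, y, w, h) is a
    hypothetical initial bounding box around the expected target region."""
    mask = np.zeros(image_bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD))

def intersection_over_union(fg_mask: np.ndarray,
                            preset_mask: np.ndarray) -> float:
    """S602: IoU of the segmented foreground and the preset image region."""
    intersection = np.logical_and(fg_mask, preset_mask).sum()
    union = np.logical_or(fg_mask, preset_mask).sum()
    return float(intersection) / float(union) if union else 0.0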
Fig. 8 is a flowchart of an image processing method 800 according to an embodiment of the present application.
On the basis of any of the above embodiments, with reference to fig. 8, the image processing method further includes:
S801: The image to be recognized is input into an image classification model to obtain an image classification result.
The image classification model is obtained by training a first network model, for example, an Inception-series network model; preferably, Inception V3 can be used as the backbone network.
The classification result characterizes whether the state of the target object is normal or abnormal. Optionally, the abnormal state includes at least one of a recapture, a copy, or an invalid object; taking an identity card as an example of the target object, an invalid object includes a temporary identity card, a document that is not an identity card, or the like. Optionally, the abnormal state further covers whether the correct side of the target object is presented: for example, when the front side of an identity card is required, the state is abnormal if the target object in the image is the back side of the identity card, and normal if it is the front side.
For example, the image to be recognized may be input into the image classification model directly, or it may be preprocessed first and the processed image input into the model, for example after converting the image to be recognized into a gray-scale image.
S802: When the classification result indicates that the state of the target object is normal and the image satisfies the recognition condition, the target object in the image is recognized to obtain the information of the target object.
It should be noted that the execution order of determining, based on the classification result, whether the state of the target object is normal and determining whether the image satisfies the recognition condition is not limited; that is, the former may be performed before, after, or simultaneously with the latter.
In this embodiment, a classification judgment is additionally performed on the target object in the image, and the target object is recognized only when it is determined to be in the normal state, which prevents erroneous recognition and thus prevents the acquired information of the target object from being wrong.
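By way of illustration, an image classification model with an Inception V3 backbone may be assembled as sketched below using Keras; the class set, input size, and training configuration are assumptions rather than specifics of the embodiments.

# Hedged sketch of the image classification model with an Inception V3
# backbone; class names, input size, and head layers are assumptions.
import tensorflow as tf

CLASS_NAMES = ["normal", "recapture", "copy", "invalid"]  # hypothetical set

def build_classifier(input_shape=(299, 299, 3)) -> tf.keras.Model:
    backbone = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", input_shape=input_shape)
    x = tf.keras.layers.GlobalAveragePooling2D()(backbone.output)
    outputs = tf.keras.layers.Dense(len(CLASS_NAMES),
                                    activation="softmax")(x)
    return tf.keras.Model(backbone.input, outputs)

model = build_classifier()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")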
Fig. 9 is a flowchart of an image processing method 900 according to an embodiment of the present application.
On the basis of any of the above embodiments, with reference to fig. 9, the following implementation is provided for recognizing the target object in the image to obtain the information of the target object when the image satisfies the recognition condition:
S901: At least one text line image of the target object is acquired.
In this step, the text lines of the target object in the image are detected, and at least one text line image is obtained.
For example, edge detection may be performed on each text line of the target object by any image segmentation algorithm, and a binarization mask of the text line may be extracted by morphological operations (also referred to as opening and closing operations) combined with connected-domain analysis, so as to obtain the text line image.
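One possible realization of this text line extraction with OpenCV is sketched below; the structuring-element size and the noise filter are illustrative assumptions, and the input is assumed to be an 8-bit gray-scale document image.

# Hedged sketch of S901: Otsu binarization, horizontal closing so the
# characters of one line merge, then connected-domain cropping.
import cv2
import numpy as np

def text_line_images(document_gray: np.ndarray) -> list:
    _, binary = cv2.threshold(document_gray, 0, 255,
                              cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 3))  # assumed size
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    count, _, stats, _ = cv2.connectedComponentsWithStats(closed)
    lines = []
    for x, y, w, h, area in stats[1:count]:   # row 0 is the background
        if area > 100:                        # hypothetical noise filter
            lines.append(document_gray[y:y + h, x:x + w])
    return lines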
S902: The at least one text line image is input into the text recognition model to obtain the text information of the target object.
The text recognition model is trained based on the second network model.
Optionally, the second network model is a network model based on a convolutional neural network (CNN) with connectionist temporal classification (CTC). Compared with a traditional network model containing a recurrent neural network (RNN), it can recognize the content of a text line accurately while improving the recognition speed. CTC solves the problem that the lengths of the output sequence and the input sequence are inconsistent: blanks are inserted to form new sequences during training, and the blanks are removed according to certain rules during decoding. Optionally, the backbone network of the second network model may employ a DenseNet architecture.
It should be understood that, for each text line image, the text recognition model in this step outputs the text information of the target object as structured information, i.e., in the form of [key, value].
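A minimal sketch of such an RNN-free CNN+CTC recognizer is given below in Keras; the layer sizes are assumptions, and the DenseNet backbone mentioned above is replaced with two plain convolution blocks for brevity.

# Hedged sketch of the second network model: a small CNN whose width axis
# becomes the CTC time axis; all layer sizes are illustrative assumptions.
import tensorflow as tf

def build_recognizer(num_chars: int,
                     input_shape=(32, 256, 1)) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D((2, 2))(x)
    x = tf.keras.layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = tf.keras.layers.MaxPooling2D((2, 1))(x)   # keep width for time steps
    # Collapse the height axis so each width position is one time step.
    x = tf.keras.layers.Lambda(lambda t: tf.reduce_max(t, axis=1))(x)
    # num_chars + 1 logits: the extra class is the CTC blank label.
    logits = tf.keras.layers.Dense(num_chars + 1)(x)
    return tf.keras.Model(inputs, logits)

Training would pair these logits with a CTC loss such as tf.nn.ctc_loss, and decoding would remove blanks and repeated labels, e.g., with tf.keras.backend.ctc_decode.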
On the basis of the embodiment shown in fig. 9, as an example, the image to be recognized also needs to be preprocessed before the at least one text line image of the target object is acquired.
For example, during image capture of the target object, a certain relative angle exists between the image capturing device, such as the lens of a camera, and the target object, so that the target object appears distorted to some degree. As shown in fig. 10, the target object is the quadrilateral represented by the four vertices 1234 in fig. 10-(a); this quadrilateral is a trapezoid and therefore needs to be converted into the regular quadrilateral represented by the four vertices 1234 in fig. 10-(b).
For example, an image of the target object may be obtained from the image to be recognized based on any edge detection algorithm: edge detection is performed on the target object by an image segmentation algorithm; the binarized boundary (also referred to as a binarized mask) of the target object is extracted by morphological operations (also referred to as opening and closing operations) combined with connected-domain analysis; the maximum bounding rectangle is obtained from the binarized boundary; and the possibility of false detection is eliminated based on the ratio of the area of the target object region to the area of the image to be recognized, for example, when the area ratio is smaller than an area threshold, false detection is considered to exist, i.e., the detected image of the target object is incorrect. Further, the four vertices of the quadrilateral are obtained by connected-domain analysis, the target object within the converted regular quadrilateral is obtained by perspective transformation based on the position coordinates of the four vertices, and the information of the target object is recognized from the image containing the target object.
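The perspective transformation step may be sketched as follows; the output size and the assumption that the four vertices are already ordered are illustrative.

# Hedged sketch of the perspective transformation; `corners` are the four
# detected vertices ordered top-left, top-right, bottom-right, bottom-left,
# and the output size is a hypothetical target.
import cv2
import numpy as np

def rectify_target(image_bgr: np.ndarray, corners,
                   out_w: int = 640, out_h: int = 400) -> np.ndarray:
    src = np.array(corners, dtype=np.float32)
    dst = np.array([[0, 0], [out_w - 1, 0],
                    [out_w - 1, out_h - 1], [0, out_h - 1]], dtype=np.float32)
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image_bgr, matrix, (out_w, out_h))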
Fig. 11 is a flowchart of an image processing method 1100 according to an embodiment of the present application.
On the basis of any one of the foregoing embodiments, this embodiment provides a possible implementation manner, which specifically includes:
First, image acquisition is performed. In an application scenario, when a user captures an image of the target object with a handheld image capturing device, the electronic device provides an image preview frame to the user through a display device. In some embodiments, the outline of a preset image is displayed in the image preview frame; the preset image has the same aspect ratio as the target object to be captured, and the user can align the target object with the outline of the preset image and place it within the outline to capture the target object.
Next, imaging quality evaluation is performed on the image to be recognized containing the target object, for example, determining whether the sharpness and brightness of the image meet the preset conditions and whether bright spots or shadows exist. If the image passes the imaging quality evaluation, integrity detection is performed next; if it fails, image acquisition is performed again.
Integrity detection is performed on the image that passed the imaging quality evaluation, determining whether the target object has missing edges, missing corners, or occlusion. If the integrity detection passes, risk type evaluation is performed next; if it fails, the image is re-acquired.
Risk type evaluation is performed on the image to be recognized by the pre-trained image classification model to obtain a classification result. When the classification result indicates that the state of the target object is normal, the target object is detected; when the classification result indicates that the state is abnormal, image acquisition is performed again.
When the image to be recognized passes all of the above evaluations and detections, it can be recognized accurately. This embodiment then detects the target object in the image, for example obtaining the image of the target object by an image segmentation algorithm; performs text line detection on the image of the target object, for example obtaining at least one text line image by an image segmentation algorithm; and inputs the at least one text line image into the pre-trained text recognition model, which outputs structured text information.
Fig. 12 is a schematic structural diagram of an electronic device 1200 according to an embodiment of the present application. As shown in fig. 12, the electronic device 1200 includes:
an acquisition unit 1210 for acquiring at least one image feature of an image to be identified, the image comprising a target object, the image feature being used to characterize the imaging quality of the image or the integrity of the target object;
A processing unit 1220 configured to determine, based on the at least one image feature, whether the image satisfies the recognition condition;
the processing unit 1220 is further configured to identify a target object in the image when the image meets the identification condition, and obtain information of the target object.
The electronic device 1200 provided in this embodiment includes an acquisition unit 1210 and a processing unit 1220, and determines, based on at least one image feature of an acquired image to be identified, whether the image satisfies an identification condition in terms of imaging quality and/or integrity of a target object, and identifies the target object in the image when the image satisfies the identification condition, so as to obtain an accurate identification result.
In one possible design, the acquisition unit 1210 is specifically configured to:
acquiring at least one first feature of the image, the first feature being used to characterize the imaging quality of the image;
and/or the number of the groups of groups,
a second feature of the image is acquired, the second feature being used to characterize the integrity of the target object.
In one possible design, processing unit 1220 may be specifically configured to:
for each image feature of the at least one image feature, obtaining, based on the image feature and a corresponding threshold, an evaluation result of whether the image feature satisfies a corresponding preset condition;
and determining, based on the evaluation result of each image feature, whether the image satisfies the recognition condition.
In one possible design, processing unit 1220 may be specifically configured to:
and when the evaluation result of each image feature indicates that the image feature meets the corresponding preset condition, determining that the image meets the identification condition.
In one possible design, the acquisition unit 1210 is specifically configured to:
converting the image into a gray scale image;
at least one first feature of the image is determined based on the gray scale image.
In one possible design, the acquisition unit 1210 is specifically configured to:
converting the gray level image into a Laplacian image through a Laplacian algorithm;
at least one first feature of the image is determined based on the laplace image.
In one possible design, the acquisition unit 1210 is specifically configured to:
and carrying out convolution operation on the gray level image through a preset Laplace mask to obtain a Laplace image.
In one possible design, the acquisition unit 1210 is specifically configured to:
calculating to obtain at least one of variance, mean value, first pixel number or second pixel number of the Laplace image based on pixel values of all pixel points in the Laplace image;
the first pixel number is the number of adjacent pixels with pixel values larger than a first preset pixel value, and the second pixel number is the number of adjacent pixels with pixel values smaller than a second preset pixel value, and the first preset pixel value is larger than the second preset pixel value.
In one possible design, processing unit 1220 may be specifically configured to:
if the image features are variances of the images, determining that the image features meet corresponding preset conditions when the variances are larger than a definition threshold;
if the image features are the average value of the image, determining that the image features meet corresponding preset conditions when the average value is larger than a brightness threshold value;
if the image features are the first pixel number, determining that the image features meet corresponding preset conditions when the first pixel number is smaller than a first number threshold, wherein the first pixel number is the number of adjacent pixels with pixel values larger than a first preset pixel value;
if the image feature is the second pixel number, determining that the image feature meets the corresponding preset condition when the second pixel number is smaller than a second number threshold, wherein the second pixel number is the number of adjacent pixels with the pixel value smaller than a second preset pixel value.
In one possible design, the acquisition unit 1210 is specifically configured to:
image segmentation is carried out on the image through a segmentation algorithm to obtain a foreground image and a background image, wherein the foreground image comprises a target object, and the background image does not comprise the target object;
and calculating the intersection ratio of the foreground image and the preset image based on the foreground image and the preset image, wherein the intersection ratio is used for representing the ratio of the intersection of the foreground image and the preset image to the union.
In one possible design, processing unit 1220 may be specifically configured to:
when the intersection ratio is larger than the intersection ratio threshold value, determining that the image features meet corresponding preset conditions, wherein the intersection ratio is used for representing the ratio of intersection and union of a foreground image and a preset image, and the foreground image is an image containing a target object obtained by image segmentation of the image.
In one possible design, the acquisition unit 1210 is further configured to: input the image to be recognized into an image classification model to obtain an image classification result, where the image classification model is obtained by training a first network model, the classification result characterizes whether the state of the target object is normal or abnormal, and the abnormal state includes at least one of a recapture, a copy, or an invalid object;
the processing unit 1220 is further configured to perform, when the classification result indicates that the state of the target object is a normal state, a step of identifying the target object in the image when the image satisfies the identification condition, and obtaining information of the target object.
In one possible design, the processing unit 1220 is specifically configured to:
acquiring at least one text line image of a target object;
and inputting at least one text line image into a text recognition model to obtain text information of the target object, wherein the text recognition model is trained based on the second network model.
The electronic device provided in this embodiment may be used to implement the method in any of the foregoing embodiments, and the implementation effect is similar to that of the method embodiment, and will not be described herein.
Fig. 13 is a schematic hardware structure of an electronic device 1300 according to an embodiment of the application. As shown in fig. 13, in general, an electronic device 1300 includes: a processor 1310 and a memory 1320.
Processor 1310 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 1310 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 1310 may also include a main processor and a coprocessor: the main processor, also referred to as a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1310 may integrate a GPU (Graphics Processing Unit) for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 1310 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1320 may include one or more computer-readable storage media, which may be non-transitory. Memory 1320 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1320 is used to store at least one instruction for execution by processor 1310 to implement the methods provided by the method embodiments of the present application.
Optionally, as shown in fig. 13, the electronic device 1300 may further include a transceiver 1330, and the processor 1310 may control the transceiver 1330 to communicate with other devices; specifically, it may send information or data to other devices, or receive information or data sent by other devices.
The transceiver 1330 may include, among other things, a transmitter and a receiver. The transceiver 1330 may further include antennas, the number of which may be one or more.
Optionally, the electronic device 1300 may implement corresponding flows in the methods according to the embodiments of the present application, which are not described herein for brevity.
Those skilled in the art will appreciate that the structure shown in fig. 13 does not limit the electronic device 1300, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
Embodiments of the present application also provide a non-transitory computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the methods provided by the above embodiments.
The computer-readable storage medium in this embodiment may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., an SSD), or the like.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
The embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method provided by the above embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description covers only preferred embodiments of the application and is not intended to limit the application to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.