CN115965989A - Document identification method, device, electronic device and storage medium - Google Patents
Document identification method, device, electronic device and storage medium
- Publication number
- CN115965989A (CN202211729300.4A)
- Authority
- CN
- China
- Prior art keywords
- document
- rectangle
- target
- preview image
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Studio Devices (AREA)
Abstract
Description
Technical field
The present application relates to the technical field of image processing, and in particular to a document recognition method, apparatus, electronic device, and storage medium.
Background
Among existing camera functions, document recognition is relatively important and frequently used, especially by students. However, existing document recognition functions either cannot recognize documents during camera preview, or require a considerable share of the electronic device's performance to recognize documents in the preview picture in real time.
Summary of the invention
The embodiments of the present application disclose a document recognition method, apparatus, electronic device, and storage medium that can reduce the performance consumed by document recognition.
An embodiment of the present application discloses a document recognition method, the method including:
acquiring a preview picture when the camera is working;
detecting, according to a first document recognition algorithm, whether a first document rectangle exists in the preview picture;
when the first document rectangle exists in the preview picture and a shooting instruction is received, detecting a second document rectangle in the captured image according to a second document recognition algorithm;
superimposing and displaying the second document rectangle on the captured image;
wherein the computational cost of the first document recognition algorithm is lower than that of the second document recognition algorithm, and the accuracy of the first document recognition algorithm is lower than that of the second document recognition algorithm.
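The control flow of the method above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the class name `TwoStageRecognizer`, its method names, and the representation of a rectangle as a `(left, top, right, bottom)` tuple are all assumptions introduced here, and the two detectors are passed in as plain callables.

```python
from typing import Callable, Optional, Tuple

# Hypothetical rectangle representation: (left, top, right, bottom).
Rect = Tuple[int, int, int, int]
Detector = Callable[[object], Optional[Rect]]


class TwoStageRecognizer:
    """Cheap detector on every preview frame, costly detector only on capture."""

    def __init__(self, fast_detector: Detector, accurate_detector: Detector):
        self.fast_detector = fast_detector          # first algorithm (low cost)
        self.accurate_detector = accurate_detector  # second algorithm (high accuracy)
        self.first_rect: Optional[Rect] = None

    def on_preview_frame(self, frame: object) -> Optional[Rect]:
        # Runs for each preview frame and remembers the latest result.
        self.first_rect = self.fast_detector(frame)
        return self.first_rect

    def on_shutter(self, captured_image: object) -> Optional[Rect]:
        # The second algorithm is invoked only if the preview found a document.
        if self.first_rect is None:
            return None
        return self.accurate_detector(captured_image)
```

With this shape, a shutter press while no first document rectangle is present returns without ever calling the second detector, which is where the saving in computation comes from.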
In one embodiment, the preview picture includes multiple frames of preview images, and the detecting, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture includes:
determining document pixels in a target preview image whose grayscale values satisfy a preset condition, wherein the target preview image is any one of the multiple frames of preview images;
determining whether a first document rectangle exists in the target preview image according to how densely the document pixels are distributed in the target preview image;
when it is determined that the first document rectangle exists in the target preview image, the method further includes:
superimposing and displaying the first document rectangle on the target preview image.
In one embodiment, the determining document pixels in the target preview image whose grayscale values satisfy a preset condition includes:
calculating the average grayscale value of all pixels in the target preview image;
subtracting the average grayscale value from the grayscale value of each pixel to obtain a calculation result for each pixel;
taking the pixels whose calculation results are greater than a preset threshold as the document pixels whose grayscale values satisfy the preset condition.
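The three steps above can be sketched in a few lines. The function name `document_pixels` and the representation of the image as a list of rows of grayscale values are assumptions made for illustration.

```python
def document_pixels(gray, threshold):
    """Return the (row, col) positions whose gray value exceeds the image mean
    by more than `threshold`.

    gray: non-empty list of rows, each a list of grayscale values.
    """
    values = [v for row in gray for v in row]
    mean = sum(values) / len(values)  # average grayscale value of all pixels
    return {
        (r, c)
        for r, row in enumerate(gray)
        for c, v in enumerate(row)
        if v - mean > threshold       # signed difference, not absolute value
    }
```

Because the difference is signed, only pixels brighter than the average can qualify; pixels darker than the average produce negative results and are never selected.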
In one embodiment, in the case where the first document rectangle exists in the target preview image, the determining whether a first document rectangle exists in the target preview image according to how densely the document pixels are distributed in the target preview image includes:
determining at least one document rectangle in the target preview image according to how densely the document pixels are distributed in the target preview image;
determining the document pixel density, color-difference contrast, and confidence of each document rectangle;
computing a target result for each document rectangle as a weighted combination of its document pixel density, color-difference contrast, and confidence;
taking the document rectangle with the largest target result among the document rectangles as the first document rectangle.
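The weighted selection can be illustrated as follows. The source does not give the weights or the scale of the three quantities, so the weights `(0.4, 0.3, 0.3)` and the assumption that each metric is already normalized to the range 0 to 1 are hypothetical, as are the names `Candidate` and `pick_first_rectangle`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Candidate:
    rect: Tuple[int, int, int, int]  # (left, top, right, bottom)
    density: float                   # document pixel density, assumed in [0, 1]
    contrast: float                  # color-difference contrast, assumed in [0, 1]
    confidence: float                # confidence, assumed in [0, 1]


def pick_first_rectangle(candidates: Sequence[Candidate],
                         weights=(0.4, 0.3, 0.3)) -> Optional[Candidate]:
    """Return the candidate with the largest weighted target result."""
    w_density, w_contrast, w_confidence = weights  # illustrative weights only

    def target_result(c: Candidate) -> float:
        return (w_density * c.density
                + w_contrast * c.contrast
                + w_confidence * c.confidence)

    return max(candidates, key=target_result, default=None)
```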
In one embodiment, the determining the document pixel density, color-difference contrast, and confidence of each document rectangle includes:
calculating the document pixel density of each document rectangle according to the number of document pixels in that document rectangle;
calculating the color-difference contrast of each document rectangle according to the color values of the pixels inside that document rectangle and the color values of the pixels outside it;
obtaining the confidence of each document rectangle according to the text information and grayscale information of that document rectangle.
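One possible reading of the first two quantities is sketched below. The source states only which inputs each quantity depends on, so the concrete formulas are assumptions: density is taken as the document pixel count divided by the rectangle area, and contrast as the Euclidean distance between the mean RGB color inside and outside the rectangle. The confidence is not sketched because the source does not say how text and grayscale information are combined.

```python
import math


def pixel_density(doc_pixels, rect):
    """Document pixels inside `rect` divided by its area (assumed formula).

    doc_pixels: iterable of (row, col); rect: (left, top, right, bottom), inclusive.
    """
    left, top, right, bottom = rect
    area = (right - left + 1) * (bottom - top + 1)
    inside = sum(1 for r, c in doc_pixels
                 if top <= r <= bottom and left <= c <= right)
    return inside / area


def color_contrast(image, rect):
    """Distance between the mean color inside and outside `rect` (assumed formula).

    image: list of rows of (R, G, B) tuples.
    """
    left, top, right, bottom = rect
    sums = {True: [0, 0, 0], False: [0, 0, 0]}
    counts = {True: 0, False: 0}
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            inside = top <= r <= bottom and left <= c <= right
            counts[inside] += 1
            for i in range(3):
                sums[inside][i] += pixel[i]
    if counts[True] == 0 or counts[False] == 0:
        return 0.0  # nothing to compare against
    mean_in = [s / counts[True] for s in sums[True]]
    mean_out = [s / counts[False] for s in sums[False]]
    return math.dist(mean_in, mean_out)
```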
In one embodiment, the determining at least one document rectangle in the target preview image according to how densely the document pixels are distributed in the target preview image includes:
analyzing, by means of a density recognition model, how densely the document pixels are distributed in the target preview image, and identifying at least one model recognition region whose document pixel density is greater than a density threshold;
determining, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model recognition region.
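The source does not describe the internals of the density recognition model. As a stand-in for illustration only, the sketch below replaces it with a simple grid heuristic: the image is divided into square cells, cells whose document pixel density exceeds the threshold are kept, and adjacent kept cells are merged into one region. The function name, the cell size, and the default threshold are all assumptions.

```python
def dense_regions(doc_pixels, width, height, cell=8, density_threshold=0.5):
    """Group document pixels into regions of dense, 4-connected grid cells.

    doc_pixels: set of (row, col). Returns a list of sets of (row, col).
    """
    # Count document pixels per grid cell.
    counts = {}
    for r, c in doc_pixels:
        key = (r // cell, c // cell)
        counts[key] = counts.get(key, 0) + 1

    def cell_area(key):
        cr, cc = key
        # Cells on the right and bottom edges may be smaller than cell x cell.
        return min(cell, height - cr * cell) * min(cell, width - cc * cell)

    dense = {k for k, n in counts.items() if n / cell_area(k) > density_threshold}

    regions, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        # Flood fill over neighbouring dense cells.
        stack, group = [start], set()
        seen.add(start)
        while stack:
            cr, cc = stack.pop()
            group.add((cr, cc))
            for nb in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        regions.append({(r, c) for r, c in doc_pixels
                        if (r // cell, c // cell) in group})
    return regions
```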
In one embodiment, the determining, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model recognition region includes:
determining the left boundary of the document rectangle corresponding to a target model recognition region according to the leftmost pixel in the target model recognition region, wherein the target model recognition region is any one of the at least one model recognition region;
determining the right boundary of the document rectangle corresponding to the target model recognition region according to the rightmost pixel in the target model recognition region;
determining the upper boundary of the document rectangle corresponding to the target model recognition region according to the uppermost pixel in the target model recognition region;
determining the lower boundary of the document rectangle corresponding to the target model recognition region according to the lowermost pixel in the target model recognition region;
determining the document rectangle corresponding to the target model recognition region according to the left boundary, the right boundary, the upper boundary, and the lower boundary.
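Taken together, these steps amount to computing an axis-aligned bounding box of the region. A minimal sketch, assuming a region is a non-empty collection of `(row, col)` pixel positions and that the rectangle is returned as `(left, top, right, bottom)`; the function name is hypothetical.

```python
def bounding_rectangle(region):
    """Return (left, top, right, bottom) enclosing every pixel in `region`."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    left, right = min(cols), max(cols)   # leftmost and rightmost pixels
    top, bottom = min(rows), max(rows)   # uppermost and lowermost pixels
    return (left, top, right, bottom)
```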
An embodiment of the present application discloses a document recognition apparatus, the apparatus including:
an acquisition module, configured to acquire a preview picture when the camera is working;
a first detection module, configured to detect, according to a first document recognition algorithm, whether a first document rectangle exists in the preview picture;
a second detection module, configured to detect a second document rectangle in the captured image according to a second document recognition algorithm when the first document rectangle exists in the preview picture and a shooting instruction is received;
a display module, configured to superimpose and display the second document rectangle on the captured image;
wherein the computational cost of the first document recognition algorithm is lower than that of the second document recognition algorithm, and the accuracy of the first document recognition algorithm is lower than that of the second document recognition algorithm.
In one embodiment, the preview picture includes multiple frames of preview images, and the first detection module is further configured to: determine document pixels in a target preview image whose grayscale values satisfy a preset condition, wherein the target preview image is any one of the multiple frames of preview images; and determine whether a first document rectangle exists in the target preview image according to how densely the document pixels are distributed in the target preview image.
The display module is further configured to superimpose and display the first document rectangle on the target preview image.
In one embodiment, the first detection module is further configured to: calculate the average grayscale value of all pixels in the target preview image; subtract the average grayscale value from the grayscale value of each pixel to obtain a calculation result for each pixel; and take the pixels whose calculation results are greater than a preset threshold as the document pixels whose grayscale values satisfy the preset condition.
In one embodiment, in the case where the first document rectangle exists in the target preview image, the first detection module is further configured to: determine at least one document rectangle in the target preview image according to how densely the document pixels are distributed in the target preview image; determine the document pixel density, color-difference contrast, and confidence of each document rectangle; compute a target result for each document rectangle as a weighted combination of its document pixel density, color-difference contrast, and confidence; and take the document rectangle with the largest target result among the document rectangles as the first document rectangle.
In one embodiment, the first detection module is further configured to: calculate the document pixel density of each document rectangle according to the number of document pixels in that document rectangle; calculate the color-difference contrast of each document rectangle according to the color values of the pixels inside that document rectangle and the color values of the pixels outside it; and obtain the confidence of each document rectangle according to the text information and grayscale information of that document rectangle.
In one embodiment, the first detection module is further configured to: analyze, by means of a density recognition model, how densely the document pixels are distributed in the target preview image, and identify at least one model recognition region whose document pixel density is greater than a density threshold; and determine, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model recognition region.
In one embodiment, the first detection module is further configured to: determine the left boundary of the document rectangle corresponding to a target model recognition region according to the leftmost pixel in the target model recognition region, wherein the target model recognition region is any one of the at least one model recognition region; determine the right boundary of that document rectangle according to the rightmost pixel in the target model recognition region; determine the upper boundary of that document rectangle according to the uppermost pixel in the target model recognition region; determine the lower boundary of that document rectangle according to the lowermost pixel in the target model recognition region; and determine the document rectangle corresponding to the target model recognition region according to the left boundary, the right boundary, the upper boundary, and the lower boundary.
An embodiment of the present application discloses a terminal device, including:
a memory storing executable program code;
a processor coupled to the memory;
wherein the processor invokes the executable program code stored in the memory to execute the method described in any of the foregoing embodiments.
An embodiment of the present application discloses a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to execute the method described in any of the foregoing embodiments.
With the document recognition method, apparatus, electronic device, and storage medium disclosed in the embodiments of the present application, the electronic device can acquire a preview picture when the camera is working; detect, according to a first document recognition algorithm, whether a first document rectangle exists in the preview picture; when the first document rectangle exists in the preview picture and a shooting instruction is received, detect a second document rectangle in the captured image according to a second document recognition algorithm; and superimpose and display the second document rectangle on the captured image. Because the first document recognition algorithm used on the preview picture has a lower computational cost than the second document recognition algorithm used on the captured image, the electronic device can recognize documents in the preview picture in real time at a lower performance cost. Because the first document recognition algorithm is also less accurate than the second, the electronic device applies the more accurate algorithm only to the images the user actually captures, which both improves the user experience and reduces the performance consumed by document recognition.
Description of drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of a document recognition method disclosed in an embodiment of the present application;
FIG. 2 is a schematic flowchart of a document recognition method disclosed in an embodiment of the present application;
FIG. 3 is a schematic flowchart of another document recognition method disclosed in an embodiment of the present application;
FIG. 4 is a schematic flowchart of yet another document recognition method disclosed in an embodiment of the present application;
FIG. 5 is a schematic block diagram of a document recognition apparatus disclosed in an embodiment of the present application;
FIG. 6 is a structural block diagram of an electronic device disclosed in an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
It should be noted that the terms "include" and "have" and any variations of them in the embodiments of the present application are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
It can be understood that the terms "first", "second", and the like used in this application may be used herein to describe various elements, but these elements are not limited by these terms. The terms are only used to distinguish one element from another. For example, without departing from the scope of the present application, the first document recognition algorithm could be termed the second document recognition algorithm, and similarly the second document recognition algorithm could be termed the first document recognition algorithm. Both are document recognition algorithms, but they are not the same document recognition algorithm.
The embodiments of the present application disclose a document recognition method, apparatus, electronic device, and storage medium that can reduce the performance consumed by document recognition.
A detailed description is given below with reference to the drawings.
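FIG. 1 is a schematic diagram of an application scenario of a document recognition method disclosed in an embodiment of the present application. The application scenario may include an electronic device 10 and a document 20. The electronic device 10 may include, but is not limited to, a mobile phone, a tablet computer, a wearable device, a notebook computer, a PC (Personal Computer), and the like. In addition, the operating system of the electronic device 10 may include, but is not limited to, the Android operating system, the IOS operating system, the Symbian operating system, the Black Berry operating system, the Windows Phone8 operating system, and the like, which is not limited in the embodiments of the present application. The camera of the electronic device 10 can be pointed at the document 20 while working, so that an image containing the document 20 can be captured.
When the camera is working, the electronic device 10 acquires a preview picture and then detects, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture. When the first document rectangle exists in the preview picture and a shooting instruction is received, the electronic device 10 detects a second document rectangle in the captured image according to the second document recognition algorithm and superimposes and displays the second document rectangle on the captured image. The computational cost of the first document recognition algorithm is lower than that of the second document recognition algorithm, and the accuracy of the first document recognition algorithm is lower than that of the second document recognition algorithm.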
FIG. 2 is a schematic flowchart of a document recognition method disclosed in an embodiment of the present application. The document recognition method can be applied to the electronic device in the above embodiment and may include the following steps:
Step 210: acquire a preview picture when the camera is working.
While the camera is working, the electronic device can acquire the preview picture and display it in real time on its display screen. Specifically, the electronic device may receive a control instruction that invokes the camera, control the camera to work according to that instruction, acquire the preview picture captured by the camera in real time, and control the display screen to show that preview picture in the corresponding area.
Step 220: detect, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture.
The electronic device may detect, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture. The first document rectangle may be a rectangle, identified by the first document recognition algorithm, that contains a document region. The first document recognition algorithm is used to identify whether a first document rectangle exists in the multiple frames of preview images included in the preview picture. The type of the first document recognition algorithm is not limited, and it may be manually preset.
As an optional implementation, the electronic device may select the first document recognition algorithm from multiple document recognition algorithms according to the performance parameters of the electronic device and the frame rate of the preview picture. The electronic device may have performance parameters for several hardware components, such as the CPU (central processing unit), the GPU (graphics processing unit), and the RAM (Random Access Memory). The frame rate of the preview picture is the number of image frames the preview picture contains per second.
The electronic device may score its performance according to the performance parameters of each hardware component to obtain a performance score. From that score, the electronic device can determine the document recognition speed of each document recognition algorithm, that is, the number of image frames the algorithm can process per second. It then selects the first document recognition algorithm from the multiple document recognition algorithms according to each algorithm's document recognition speed and the frame rate of the preview picture. For example, suppose the electronic device has document recognition algorithms A, B, and C with document recognition speeds of 15, 25, and 35 frames per second respectively, and the frame rate of the preview picture is 30 frames per second. Document recognition algorithm C can then be selected as the first document recognition algorithm.
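The selection in the example can be sketched as a filter on recognition speed. The source shows only that algorithm C, the one fast enough for a 30 frames-per-second preview, is chosen; the tie-break used below when several algorithms are fast enough (pick the slowest of them) is an assumption, as is the function name. How the performance score maps to a recognition speed is not specified in the source, so the speeds are taken as given inputs.

```python
def choose_first_algorithm(speeds_fps, preview_fps):
    """Pick an algorithm whose recognition speed keeps up with the preview.

    speeds_fps: dict mapping algorithm name to frames recognized per second.
    Returns None if no algorithm is fast enough.
    """
    fast_enough = {name: fps for name, fps in speeds_fps.items()
                   if fps >= preview_fps}
    if not fast_enough:
        return None
    # Assumed tie-break: among the candidates, take the slowest one.
    return min(fast_enough, key=fast_enough.get)
```

Called with `{"A": 15, "B": 25, "C": 35}` and a preview frame rate of 30, this returns `"C"`, matching the example in the text.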
With this implementation, the electronic device can select a first document recognition algorithm suited to the preview picture, so that its performance is sufficient to detect whether a first document rectangle exists in the preview picture, which improves the image processing efficiency of the electronic device.
Step 230: when the first document rectangle exists in the preview picture and a shooting instruction is received, detect a second document rectangle in the captured image according to the second document recognition algorithm.
When the first document rectangle exists in the preview picture and a shooting instruction is received, the electronic device detects a second document rectangle in the captured image according to the second document recognition algorithm. On receiving the shooting instruction, the electronic device can acquire the image currently captured by the camera. It should be noted that the electronic device usually processes captured images and preview images with different algorithms, but because the preview picture is captured in real time, one frame of the preview picture may also correspond to the shooting instruction when it is received. The electronic device therefore invokes the second document recognition algorithm only when the first document rectangle exists in the preview picture, that is, when the first document recognition algorithm has determined that the preview picture contains a document, and a shooting instruction is received. When no first document rectangle exists in the preview picture, that is, when the first document recognition algorithm has determined that the preview picture contains no document, the electronic device only needs to acquire the image currently captured by the camera even if a shooting instruction is received, and does not need to invoke the second document recognition algorithm, which reduces the computation performed by the electronic device.
The computational cost of the first document recognition algorithm is lower than that of the second document recognition algorithm, and the accuracy of the first document recognition algorithm is lower than that of the second document recognition algorithm. Specifically, the first document recognition algorithm processes the multiple frames of preview images in the preview picture, while the second document recognition algorithm processes the captured images corresponding to shooting instructions, and captured images are usually far fewer than preview frames. The electronic device can therefore use the cheaper first document recognition algorithm on the preview picture, so that its performance is sufficient for document recognition during preview, and use the more accurate second document recognition algorithm on captured images, so that the document rectangle identified in a captured image is more precise. In a specific embodiment, the first document recognition algorithm may determine the first document rectangle from pixel grayscale values, and the second document recognition algorithm may determine the second document rectangle by identifying feature points of the image.
Step 240: superimpose and display the second document rectangle on the captured image.
The electronic device may superimpose and display the second document rectangle on the captured image. Optionally, after acquiring the captured image, the electronic device may show the captured image on the display screen and superimpose the second document rectangle on it. Alternatively, the electronic device may show the captured image on the display screen, with the second document rectangle superimposed, after receiving an instruction to display the captured image.
In the embodiments of the present application, the electronic device can acquire a preview picture when the camera is working; detect, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture; when the first document rectangle exists in the preview picture and a shooting instruction is received, detect a second document rectangle in the captured image according to the second document recognition algorithm; and superimpose and display the second document rectangle on the captured image. Because the first document recognition algorithm used on the preview picture has a lower computational cost than the second document recognition algorithm used on the captured image, the electronic device can recognize documents in the preview picture in real time at a lower performance cost. Because the first document recognition algorithm is also less accurate than the second, the electronic device applies the more accurate algorithm only to the images the user actually captures, which both improves the user experience and reduces the performance consumed by document recognition.
As shown in FIG. 3, which is a schematic flowchart of another document recognition method disclosed in an embodiment of the present application, the method may be applied to the electronic device in the above embodiments and may include the following steps:
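Step 310: when the camera is working, acquire a preview picture.
Step 310 is the same as step 210 in the above embodiment and is not repeated here.
Step 320: determine the document pixels in the target preview image whose gray values satisfy a preset condition.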
The preview picture acquired by the electronic device may include multiple frames of preview images, and the target preview image may be any one of those frames. The electronic device may determine the document pixels in the target preview image whose gray values satisfy the preset condition. The preset condition may be set manually or may be determined by the electronic device from the pixels of the target preview image. The electronic device may calculate the gray value of each pixel in the target preview image. Optionally, it may calculate the gray value of each pixel from that pixel's RGB (red, green, blue) value; the embodiments of the present application do not limit the formula used to convert an RGB value to a gray value.
In one embodiment, the electronic device may calculate the average gray value of all pixels in the target preview image, subtract the average gray value from the gray value of each pixel to obtain a calculation result for that pixel, and take the pixels whose calculation result is greater than a preset threshold as the document pixels whose gray values satisfy the preset condition. The electronic device may first determine the gray value of each pixel in the target preview image and then calculate the average gray value of all pixels. The preset threshold may be set manually or may be determined from the average gray value; for example, it may be 5% of the average gray value. The calculation result for a pixel may be positive or negative. For example, suppose the target preview image includes multiple pixels, the average gray value is 100, the preset threshold is 15, and three of the pixels have gray values of 50, 80 and 120. The calculation results are then -50, -20 and 20, and only the pixel with a gray value of 120 is taken as a document pixel. In this embodiment, the document pixels are determined from the gray values of the pixels in the target preview image.
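The mean-subtraction rule above can be sketched in a few lines. This is a minimal sketch, not the applicant's implementation: the function names are hypothetical, and the RGB-to-gray weights are the common ITU-R BT.601 luma weights, chosen here only as an assumption because the application leaves the conversion formula open.

```python
def rgb_to_gray(r, g, b):
    # Assumption: BT.601 luma weights; the application does not fix a formula.
    return 0.299 * r + 0.587 * g + 0.114 * b


def document_pixel_mask(gray, threshold):
    """gray: 2D list of gray values.
    Returns a same-shaped 2D list of booleans, True for document pixels,
    i.e. pixels whose (gray value - average gray value) exceeds threshold."""
    values = [v for row in gray for v in row]
    mean = sum(values) / len(values)
    return [[(v - mean) > threshold for v in row] for row in gray]


# A one-row image whose average gray value is 100, with threshold 15.
gray = [[50, 80, 120, 150]]
mask = document_pixel_mask(gray, threshold=15)
# The differences are -50, -20, 20 and 50, so only the last two pixels qualify.
```

Because the comparison is on the signed difference, pixels darker than the average never qualify, which matches the example in which only the pixel with gray value 120 is selected.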
Step 330: determine, according to the density of the document pixels in the target preview image, whether a first document rectangle exists in the target preview image.
The electronic device may determine whether a first document rectangle exists in the target preview image according to how densely the document pixels are distributed in it. The density may be determined by the electronic device from the distribution of the document pixels in the target preview image, from the distances between document pixels, or by a density recognition model obtained through deep learning training; the embodiments of the present application do not limit this.
As an optional implementation, the electronic device may divide the target preview image into multiple regions, determine for each region the proportion of document pixels among all pixels of that region, and use that proportion as the density of the region. If any region has a proportion greater than a proportion threshold, it is determined that a first document rectangle exists in the target preview image. With this implementation, the electronic device can decide whether a first document rectangle exists from the pre-divided regions, which reduces computational complexity.
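The region-based check above can be sketched as follows, assuming a boolean document-pixel mask is already available. The uniform grid split, the function names and the parameter names are assumptions made for illustration; the application only says that the image is divided into multiple regions.

```python
def region_ratios(mask, rows, cols):
    """Split a 2D boolean mask into rows x cols regions and return the
    proportion of document pixels (True values) in each region."""
    height, width = len(mask), len(mask[0])
    ratios = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cells = [mask[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # An empty region (grid finer than the image) counts as ratio 0.
            ratios.append(sum(cells) / len(cells) if cells else 0.0)
    return ratios


def has_first_rectangle(mask, rows, cols, ratio_threshold):
    """True if any region's document-pixel proportion exceeds the threshold."""
    return any(ratio > ratio_threshold for ratio in region_ratios(mask, rows, cols))
```

Each pixel is visited once, so the cost is linear in the number of pixels, which is consistent with the stated aim of keeping the preview-stage algorithm cheap.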
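Step 340: if it is determined that a first document rectangle exists in the target preview image, superimpose the first document rectangle on the target preview image.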
If it is determined that a first document rectangle exists in the target preview image, the electronic device may determine the position of the first document rectangle and superimpose it on the target preview image, so that the first document rectangle is displayed in real time in the preview picture. If it is determined that no first document rectangle exists in the target preview image, no first document rectangle is shown in the preview picture. Whether the first document rectangle is displayed thus tells the user whether a document has been detected in the current preview picture. The line style of the first document rectangle may be a straight line, a wavy line, a dashed line and so on, and neither the line style nor the line color is limited. Optionally, the electronic device may acquire the RGB values of the pixels in the edge region of the first document rectangle and determine the line color of the first document rectangle from those RGB values.
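Step 350: when a first document rectangle exists in the preview picture and a shooting instruction is received, detect the second document rectangle in the captured image according to the second document recognition algorithm.
Step 360: superimpose the second document rectangle on the captured image.
Steps 350 to 360 are the same as steps 230 to 240 in the above embodiment and are not repeated here.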
In this embodiment of the present application, the electronic device may further determine the document pixels in the target preview image whose gray values satisfy the preset condition and determine, according to the density of the document pixels in the target preview image, whether a first document rectangle exists in it. This document recognition algorithm requires little computation, which reduces the performance it consumes on the electronic device, so that the device can recognize the multiple frames of the preview picture in real time. When it is determined that a first document rectangle exists in the target preview image, the electronic device may also superimpose the first document rectangle on the target preview image, which further improves the user experience.
As shown in FIG. 4, which is a schematic flowchart of yet another document recognition method disclosed in an embodiment of the present application, the method may be performed when a first document rectangle exists in the target preview image, may be applied to the electronic device in the above embodiments, and may include the following steps:
Step 410: when the camera is working, acquire a preview picture.
Step 420: determine the document pixels in the target preview image whose gray values satisfy a preset condition.
Step 430: determine at least one document rectangle in the target preview image according to the density of the document pixels in the target preview image.
The electronic device may determine at least one document rectangle in the target preview image according to the density of the document pixels in the target preview image. Specifically, the electronic device may determine the target regions of the target preview image whose density meets a density requirement. When a first document rectangle exists in the target preview image, there may be one or more such target regions, and each target region may correspond to one document rectangle. Optionally, if the distance between two target regions is less than a distance threshold, a single document rectangle may be determined from the two target regions; this is not limited.
In one embodiment, the electronic device may analyze the density of the document pixels in the target preview image with a density recognition model, identify at least one model-recognized region in which the document pixel density is greater than a density threshold, and then determine, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model-recognized region.
The density recognition model is a model trained in advance on a training data set, which may consist of multiple labeled training images; this is not described in detail in the embodiments of the present application. The electronic device analyzes the density of the document pixels in the target preview image and, when a first document rectangle exists in the target preview image, obtains at least one model-recognized region. The shape of a model-recognized region may include, but is not limited to, a circle, a rhombus, another regular shape or an irregular shape. The electronic device may determine the document rectangle corresponding to each model-recognized region from the document pixels in that region.
As an optional implementation, the step of determining, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model-recognized region may include: determining the left boundary of the document rectangle corresponding to a target model-recognized region according to the leftmost pixel in that region; determining the right boundary of the document rectangle according to the rightmost pixel in the region; determining the upper boundary of the document rectangle according to the uppermost pixel in the region; determining the lower boundary of the document rectangle according to the lowermost pixel in the region; and determining the document rectangle corresponding to the target model-recognized region according to the left, right, upper and lower boundaries.
The target model-recognized region is any one of the at least one model-recognized region. The electronic device may establish a pixel coordinate system for the target preview image, for example a rectangular coordinate system in units of pixels with the upper-left corner of the target preview image as the origin. The electronic device determines the leftmost pixel from the abscissas of the pixels in the target model-recognized region and determines the left boundary from the abscissa of that pixel. The right, upper and lower boundaries are determined in a similar way, which is not repeated here. The electronic device may take the intersection of the left and upper boundaries as the upper-left vertex of the document rectangle, the intersection of the left and lower boundaries as the lower-left vertex, the intersection of the right and upper boundaries as the upper-right vertex, and the intersection of the right and lower boundaries as the lower-right vertex. The document rectangle is then determined by these four vertices.
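The boundary construction above amounts to an axis-aligned bounding box of the region's pixels. A minimal sketch follows, assuming the region is given as a list of (x, y) pixel coordinates with the origin at the upper-left corner of the image and y increasing downward; the function name and the dictionary keys are hypothetical.

```python
def bounding_rectangle(region_pixels):
    """region_pixels: non-empty list of (x, y) pixel coordinates,
    origin at the upper-left corner, y growing downward.
    Returns the four vertices of the enclosing document rectangle."""
    xs = [x for x, _ in region_pixels]
    ys = [y for _, y in region_pixels]
    left, right = min(xs), max(xs)    # leftmost / rightmost abscissa
    top, bottom = min(ys), max(ys)    # uppermost / lowermost ordinate
    return {
        "top_left": (left, top),
        "bottom_left": (left, bottom),
        "top_right": (right, top),
        "bottom_right": (right, bottom),
    }
```

Because only minima and maxima are taken, the shape of the region (circle, rhombus or irregular) does not matter.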
Step 440: determine the document pixel density, color contrast and confidence corresponding to each document rectangle.
In one embodiment, the electronic device may calculate the document pixel density corresponding to each document rectangle according to the number of document pixels in that rectangle, calculate the color contrast corresponding to each document rectangle according to the color values of pixels inside the rectangle and the color values of pixels outside it, and obtain the confidence corresponding to each document rectangle according to the text information and gray information of that rectangle.
The electronic device may obtain the total number of pixels in each document rectangle and divide the number of document pixels in the rectangle by that total to obtain the document pixel density of the rectangle. The color value of a pixel may be its RGB value. The pixels inside the document rectangle may be the pixels in the inner edge region of the rectangle or all pixels inside the rectangle; the pixels outside the document rectangle may be the pixels in the surrounding region just outside the rectangle or all pixels in the target preview image. The inner edge region may include the pixels inside the rectangle whose distance from any boundary does not exceed a first preset number of pixels, for example no more than 3 pixels; the outer surrounding region may include the pixels outside the rectangle whose distance from any boundary does not exceed a second preset number of pixels. The electronic device may also detect text information in the document rectangle, which may include, but is not limited to, the number of characters, the character type and the character size. The gray information may be the gray values of all pixels in the document rectangle. The confidence corresponding to the document rectangle is obtained from the text information and the gray information.
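The density calculation above is fully specified and can be sketched directly. The application does not give a formula for the color contrast, so the second function below is purely an assumption for illustration: it uses the Euclidean distance between the mean RGB color of the inside pixels and that of the outside pixels. All function names are hypothetical.

```python
def document_pixel_density(mask, rect):
    """mask: 2D boolean list (True = document pixel).
    rect: (left, top, right, bottom), inclusive pixel bounds.
    Returns document pixels in the rectangle / all pixels in the rectangle."""
    left, top, right, bottom = rect
    total = (right - left + 1) * (bottom - top + 1)
    doc = sum(
        mask[y][x]
        for y in range(top, bottom + 1)
        for x in range(left, right + 1)
    )
    return doc / total


def mean_color(pixels):
    """pixels: non-empty list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def color_contrast(inside_pixels, outside_pixels):
    # Assumed metric (not from the application): distance between mean colors.
    a, b = mean_color(inside_pixels), mean_color(outside_pixels)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Which pixels are passed as `inside_pixels` and `outside_pixels` (edge bands or whole areas) is left to the caller, mirroring the alternatives listed above.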
Step 450: calculate a weighted target result for each document rectangle according to its document pixel density, color contrast and confidence.
The document pixel density, color contrast and confidence may each have a weight, and these weights may be preset. Optionally, the weight of the document pixel density may be greater than the weights of the color contrast and of the confidence, so that the document pixel density is the main factor influencing the target result.
Step 460: take the document rectangle with the largest target result among the multiple document rectangles as the first document rectangle.
The electronic device may take the document rectangle with the largest target result among the multiple document rectangles as the first document rectangle.
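Steps 450 and 460 can be sketched as a weighted sum followed by a maximum. The weight values, the assumption that the three features are already scaled to a comparable range such as [0, 1], and all names below are hypothetical; the application only states that the weights may be preset and that the density weight may be the largest.

```python
# Hypothetical weights; density gets the largest weight, as the text allows.
WEIGHTS = {"density": 0.5, "contrast": 0.25, "confidence": 0.25}


def target_result(features, weights=WEIGHTS):
    """features: dict with 'density', 'contrast' and 'confidence' values,
    assumed here to be pre-scaled to a comparable range such as [0, 1]."""
    return sum(weights[name] * features[name] for name in weights)


def select_first_rectangle(candidates, weights=WEIGHTS):
    """candidates: non-empty list of (rect, features) pairs.
    Returns the rect whose weighted target result is largest."""
    best = max(candidates, key=lambda item: target_result(item[1], weights))
    return best[0]


candidates = [
    ((0, 0, 10, 10), {"density": 0.9, "contrast": 0.2, "confidence": 0.4}),
    ((5, 5, 20, 20), {"density": 0.6, "contrast": 0.9, "confidence": 0.9}),
]
first_rect = select_first_rectangle(candidates)
```

With these illustrative numbers the second candidate wins (0.75 against 0.60) even though its density is lower, which shows that density is the main factor but not the only one.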
Step 470: superimpose the first document rectangle on the target preview image.
Step 480: when a first document rectangle exists in the preview picture and a shooting instruction is received, detect the second document rectangle in the captured image according to the second document recognition algorithm.
Step 490: superimpose the second document rectangle on the captured image.
Steps 470 to 490 are the same as steps 340 to 360 in the above embodiment and are not repeated here.
In this embodiment of the present application, the electronic device determines at least one document rectangle in the target preview image according to the density of the document pixels in the target preview image, determines the document pixel density, color contrast and confidence corresponding to each document rectangle, calculates a weighted target result for each document rectangle, and takes the document rectangle with the largest target result as the first document rectangle. This improves the accuracy of the first document rectangle. Given that the second document rectangle is relatively accurate, the first document rectangle is thereby brought closer to the second document rectangle, which makes the user experience more consistent.
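As shown in FIG. 5, which is a modular schematic diagram of a document recognition apparatus disclosed in an embodiment of the present application, the document recognition apparatus 500 includes an acquisition module 510, a first detection module 520, a second detection module 530 and a display module 540, wherein:
the acquisition module 510 is configured to acquire a preview picture when the camera is working;
the first detection module 520 is configured to detect, according to a first document recognition algorithm, whether a first document rectangle exists in the preview picture;
the second detection module 530 is configured to detect, when a first document rectangle exists in the preview picture and a shooting instruction is received, a second document rectangle in the captured image according to a second document recognition algorithm;
the display module 540 is configured to superimpose the second document rectangle on the captured image;
wherein the first document recognition algorithm requires less computation than the second document recognition algorithm, and the accuracy of the first document recognition algorithm is lower than that of the second document recognition algorithm.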
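In one embodiment, the preview picture includes multiple frames of preview images, and the first detection module 520 is further configured to: determine the document pixels in a target preview image whose gray values satisfy a preset condition, where the target preview image is any one of the multiple frames of preview images; and determine, according to the density of the document pixels in the target preview image, whether a first document rectangle exists in the target preview image.
The display module 540 is further configured to superimpose the first document rectangle on the target preview image.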
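In one embodiment, the first detection module 520 is further configured to: calculate the average gray value of all pixels in the target preview image; subtract the average gray value from the gray value of each pixel to obtain a calculation result for that pixel; and take the pixels whose calculation result is greater than a preset threshold as the document pixels whose gray values satisfy the preset condition.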
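In one embodiment, when a first document rectangle exists in the target preview image, the first detection module 520 is further configured to: determine at least one document rectangle in the target preview image according to the density of the document pixels in the target preview image; determine the document pixel density, color contrast and confidence corresponding to each document rectangle; calculate a weighted target result for each document rectangle according to its document pixel density, color contrast and confidence; and take the document rectangle with the largest target result among the multiple document rectangles as the first document rectangle.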
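In one embodiment, the first detection module 520 is further configured to: calculate the document pixel density corresponding to each document rectangle according to the number of document pixels in that rectangle; calculate the color contrast corresponding to each document rectangle according to the color values of pixels inside the rectangle and the color values of pixels outside it; and obtain the confidence corresponding to each document rectangle according to the text information and gray information of that rectangle.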
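In one embodiment, the first detection module 520 is further configured to: analyze the density of the document pixels in the target preview image with a density recognition model and identify at least one model-recognized region in which the document pixel density is greater than a density threshold; and determine, in the target preview image, at least one document rectangle in one-to-one correspondence with the at least one model-recognized region.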
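In one embodiment, the first detection module 520 is further configured to: determine the left boundary of the document rectangle corresponding to a target model-recognized region according to the leftmost pixel in that region, where the target model-recognized region is any one of the at least one model-recognized region; determine the right boundary of the document rectangle according to the rightmost pixel in the region; determine the upper boundary of the document rectangle according to the uppermost pixel in the region; determine the lower boundary of the document rectangle according to the lowermost pixel in the region; and determine the document rectangle corresponding to the target model-recognized region according to the left, right, upper and lower boundaries.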
In this embodiment of the present application, the electronic device may acquire a preview picture while the camera is working and detect, according to the first document recognition algorithm, whether a first document rectangle exists in the preview picture. When a first document rectangle exists in the preview picture and a shooting instruction is received, the electronic device detects the second document rectangle in the captured image according to the second document recognition algorithm and superimposes the second document rectangle on the captured image. Because the first document recognition algorithm used on the preview picture requires less computation than the second document recognition algorithm used on the captured image, the electronic device can perform document recognition on the preview picture in real time at a low performance cost. The first algorithm is also less accurate than the second, so the electronic device applies the more accurate algorithm only to the image the user actually captures, which improves the user experience and reduces the performance consumed by document recognition.
As shown in FIG. 6, in one embodiment, an electronic device is provided, which may include:
a memory 610 storing executable program code;
a processor 620 coupled to the memory 610;
the processor 620 calls the executable program code stored in the memory 610 to implement the document recognition method provided in the above embodiments.
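The memory 610 may include a random access memory (RAM) or a read-only memory (ROM). The memory 610 may be used to store instructions, programs, code, code sets or instruction sets. The memory 610 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playback function or an image playback function), instructions for implementing the above method embodiments, and so on. The data storage area may store data created by the electronic device during use.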
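The processor 620 may include one or more processing cores. The processor 620 connects the various parts of the electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 610 and by calling the data stored in the memory 610. Optionally, the processor 620 may be implemented in at least one of the following hardware forms: digital signal processing (DSP), field-programmable gate array (FPGA) and programmable logic array (PLA). The processor 620 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem and the like. The CPU mainly handles the operating system, the user interface, application programs and so on; the GPU is responsible for rendering and drawing display content; the modem handles wireless communication. It can be understood that the modem may also be implemented as a separate communication chip instead of being integrated into the processor 620.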
It can be understood that the electronic device may include more or fewer structural elements than shown in the above structural block diagram, for example a power module, physical buttons, a WiFi (Wireless Fidelity) module, a speaker, a Bluetooth module, sensors and so on, which is not limited here.
An embodiment of the present application discloses a computer-readable storage medium storing a computer program, where the computer program causes a computer to execute the methods described in the above embodiments.
In addition, an embodiment of the present application further discloses a computer program product which, when run on a computer, enables the computer to execute all or some of the steps of any of the document recognition methods described in the above embodiments.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, including a read-only memory (ROM), a random access memory (RAM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a one-time programmable read-only memory (OTPROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The document recognition method, apparatus, electronic device and storage medium disclosed in the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211729300.4A CN115965989A (en) | 2022-12-30 | 2022-12-30 | Document identification method, device, electronic device and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211729300.4A CN115965989A (en) | 2022-12-30 | 2022-12-30 | Document identification method, device, electronic device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115965989A true CN115965989A (en) | 2023-04-14 |
Family
ID=87354069
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211729300.4A Pending CN115965989A (en) | 2022-12-30 | 2022-12-30 | Document identification method, device, electronic device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115965989A (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080225353A1 (en) * | 2007-03-14 | 2008-09-18 | Canon Kabushiki Kaisha | Image reading apparatus and image reading method |
| US20110149320A1 (en) * | 2009-12-18 | 2011-06-23 | Konica Minolta Business Technologies, Inc. | Image forming device, method of forming image, and recording medium storing control program for controlling image forming device |
| WO2017067275A1 (en) * | 2015-10-21 | 2017-04-27 | 广州视睿电子科技有限公司 | Document image display method, device and terminal |
| US20180211243A1 (en) * | 2017-01-26 | 2018-07-26 | Ncr Corporation | Document Image Capture and Processing |
| CN111935407A (en) * | 2020-08-21 | 2020-11-13 | 深圳传音控股股份有限公司 | Photographing processing method, mobile terminal and storage medium |
-
2022
- 2022-12-30 CN CN202211729300.4A patent/CN115965989A/en active Pending
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||