
CN113723362A - Method and device for detecting table line in image - Google Patents

Method and device for detecting table line in image

Info

Publication number
CN113723362A
CN113723362A
Authority
CN
China
Prior art keywords
line
lines
image
unit
semantic segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111134050.5A
Other languages
Chinese (zh)
Inventor
龙伟
郭丰俊
丁凯
龙腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Linguan Data Technology Co ltd
Shanghai Shengteng Data Technology Co ltd
Shanghai Yingwuchu Data Technology Co ltd
Intsig Information Co Ltd
Shanghai Hehe Information Technology Development Co Ltd
Original Assignee
Shanghai Linguan Data Technology Co ltd
Shanghai Shengteng Data Technology Co ltd
Shanghai Yingwuchu Data Technology Co ltd
Shanghai Hehe Information Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Linguan Data Technology Co ltd, Shanghai Shengteng Data Technology Co ltd, Shanghai Yingwuchu Data Technology Co ltd, Shanghai Hehe Information Technology Development Co Ltd filed Critical Shanghai Linguan Data Technology Co ltd
Priority to CN202111134050.5A priority Critical patent/CN113723362A/en
Publication of CN113723362A publication Critical patent/CN113723362A/en
Priority to PCT/CN2022/085400 priority patent/WO2023045298A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method for detecting table lines in an image. Step S10: input the image into a semantic segmentation network to obtain the set of pixels near potential table lines. Step S20: fit line segments to this pixel set to obtain candidate table lines. Step S30: remove false table lines to obtain the real table lines. Step S40: classify all table lines into row groups and column groups. Step S50: obtain the complete structured spreadsheet. Step S60: if structuring in step S50 fails because of a table-line detection error, extract the typical features of the failure scene, generate difficult samples from those features, and retrain the semantic segmentation network. By repeatedly training the semantic segmentation network, the method improves the accuracy of table-line detection and helps raise the success rate of spreadsheet structuring.

Description

Method and device for detecting table line in image
Technical Field
The present application relates to a method for detecting table lines in an image (picture).
Background
Tables are widely used in daily life and office work, and there is great demand for converting tables in pictures into spreadsheets; automatic conversion techniques generally depend heavily on the detection of table lines. Table lines include the outer border lines that separate the inside of the table from its outside, and the inner separation lines that divide rows and columns within the table.
Poor image quality, shooting angle, uneven lighting, bent or folded paper, misaligned text regions, stamp and watermark interference, and the diversity of table-line colors, thicknesses, and styles all pose great challenges to table-line detection, and in turn affect the accuracy of table structure restoration.
Disclosure of Invention
The technical problem to be solved by the present application is to provide a method for detecting table lines in an image that is highly accurate and can effectively assist table structure restoration.
To solve the above technical problem, the method for detecting table lines in an image proposed by the present application includes the following steps. Step S10: input the image into a semantic segmentation network to obtain the set of pixels near potential table lines; this pixel set refers to isolated pixels in regions where table lines may exist. Step S20: fit line segments to the pixel set near the table lines to obtain candidate table lines. Step S30: filter the table lines obtained in step S20 according to text-line information obtained by optical character recognition of the image, and remove false table lines to obtain real table lines. Step S40: according to the positional relationships among the table lines, classify all table lines into row groups and column groups. Step S50: construct cells according to the groups to which the table lines belong, and store the optical character recognition result within each cell's range as that cell's text, finally obtaining a complete structured spreadsheet. Step S60: if spreadsheet structuring in step S50 fails because of a table-line detection error, extract the typical features of the failure scene, generate difficult samples from those features, retrain the semantic segmentation network, and repeat steps S10 to S50 with the retrained network until structuring in step S50 succeeds. By repeatedly training the semantic segmentation network, the method improves the accuracy of table-line detection and helps raise the success rate of spreadsheet structuring.
Further, in step S10, semantic segmentation of the image classifies each pixel in the image to determine its category and thereby divide the image into regions; the semantic segmentation network is based on a deep learning algorithm and comprises one or more of a convolutional neural network, a deep convolutional neural network, and a fully convolutional network. This is a detailed description of step S10.
Further, in step S30, the text-line information includes any one or more of the height of a text line, the width of a single character, and the angle of a text line.
Further, in step S40, the horizontal lines are sorted by their starting end points and processed in a loop; when horizontal lines are close in vertical distance and overlap horizontally, they are merged and deduplicated, so that horizontal lines that logically belong to one line but were detected as several are assembled into a single horizontal line. Finally, the horizontal lines of each table row are gathered into a group, which contains one or more horizontal lines depending on whether cells are merged. Vertical lines are processed in a similar way. This is a specific explanation of step S40.
Optionally, in step S40, the process is accelerated by a union-find algorithm.
Further, step S60 includes the following sub-steps. Step S61: prepare a general sample synthesis tool with multiple adjustable parameters, through which samples and labels with various features can be generated. Step S62: collect and analyze the typical features of scenes in which spreadsheet structuring fails because of table-line detection errors. Step S63: according to the typical features of the failure scene obtained in step S62, adjust the parameters of the general sample synthesis tool to generate difficult samples and labels with the same features. Step S64: retrain, with the generated difficult samples, the semantic segmentation network used to obtain the set of pixels near potential table lines in the image. This is a specific explanation of step S60.
Further, in step S61, the sample synthesis tool abstracts sample generation into five parts: basic background texture, table structure, text content and style, table-line position and style, and stamp/watermark synthesis. The parameters of the basic background texture part include any one or more of background picture, background color, texture pattern, and texture color; the parameters of the table structure part include any one or more of the number, size, position, row and column counts, and merged-cell configuration of the tables; the parameters of the text content and style part include any one or more of font size, font style, color, position, and alignment; the parameters of the table-line position and style part include any one or more of the line type, thickness, and pixel area of the table lines; the parameters of the stamp/watermark synthesis part include any one or more of the number, position, angle, and color of stamps and watermarks.
Further, in step S62, the typical features of the failure scene include any one or more of: text overlapping table lines because of printing misalignment or handwriting; false lines caused by long-stroke Chinese characters repeated vertically; missing lines caused by stamp occlusion; stamp edges mistakenly recognized as table lines; table lines hard to distinguish from the background because of strong-light shooting; cells separated by colored lines or color blocks in complex-texture samples; adjacent cells separated by two parallel lines; and very short table lines missed in dense cells.
Further, in step S63, the general sample synthesis tool generates a base image from the parameters of the basic background texture part, a table structure from the parameters of the table structure part, text content and style from the parameters of the text content and style part, and table lines and their style from the parameters of the table-line position and style part, then superimposes stamps and watermarks according to the parameters of the stamp/watermark synthesis part, and finally composes the image, table structure, text, table lines, and stamps/watermarks into a labeled picture.
The application also provides a device for detecting table lines in an image, comprising a semantic segmentation unit, a line segment fitting unit, a table line filtering unit, a table line grouping unit, a spreadsheet structuring unit, and a retraining unit. The semantic segmentation unit uses a semantic segmentation network to obtain the set of pixels near potential table lines in an input image. The line segment fitting unit fits line segments to this pixel set to obtain the table lines. The table line filtering unit filters the table lines according to text-line information obtained by optical character recognition of the image, removing false table lines to obtain real table lines. The table line grouping unit classifies all table lines into row groups and column groups according to their positional relationships. The spreadsheet structuring unit constructs cells according to the groups to which the table lines belong, and stores the optical character recognition result within each cell's range as that cell's text, finally obtaining a complete structured spreadsheet. The retraining unit, when the spreadsheet structuring unit fails because of a table-line detection error, extracts the typical features of the failure scene, generates difficult samples from those features, and retrains the semantic segmentation network; the retrained network is sent to the semantic segmentation unit, and the semantic segmentation, line segment fitting, table line filtering, table line grouping, and spreadsheet structuring units are executed again until spreadsheet structuring succeeds.
The device improves the accuracy of table line detection by repeatedly training the semantic segmentation network, and is beneficial to improving the success rate of electronic table structuring.
The technical effect obtained by this application is as follows: table lines are obtained by combining a semantic segmentation network with line segment fitting, which effectively reduces false and missing lines in table-line detection; for table-line detection in difficult scenes such as text overlapping lines, false lines from repeated characters, stamp occlusion, faint lines, colored lines, color blocks, dashed lines, double-line separation, and ultra-short lines, the typical features of failure scenes are extracted and difficult samples are generated to repeatedly train the semantic segmentation network, improving the accuracy of table-line detection.
Drawings
Fig. 1 is a schematic flowchart of a method for detecting a table line in an image according to the present application.
Fig. 2 is a schematic view of a sub-flow of step S60 in fig. 1.
Fig. 3 is a schematic structural diagram of an apparatus for detecting a table line in an image according to the present application.
The reference numbers in the figures denote: 10, semantic segmentation unit; 20, line segment fitting unit; 30, table line filtering unit; 40, table line grouping unit; 50, spreadsheet structuring unit; 60, retraining unit.
Detailed Description
Referring to fig. 1, the method for detecting a table line in an image according to the present application includes the following steps.
Step S10: the image is input into a Semantic Segmentation (Semantic Segmentation) network to obtain a set of pixels in the vicinity of potential table lines, namely isolated pixel points in some areas where table lines may exist. The semantic segmentation of an image is to classify each pixel point in the image, determine the category of each point, and thus perform region division, which is a prior art. Common semantic segmentation networks are based on deep learning algorithms, such as Convolutional Neural Networks (CNN), deep convolutional neural networks (dtn), Full Convolutional Networks (FCN), and the like. The step can effectively remove non-table lines in the image, remove character or background stripe interference and effectively reduce the problems of false lines and missing lines in table line detection.
Step S20: and performing line segment fitting on the pixel set in the area adjacent to the table line to obtain the table line, namely connecting the isolated pixel points predicted in the previous step into a line segment by adopting a traditional line segment fitting method.
Step S30: the table lines obtained in step S20 are filtered according to the character line information obtained by performing Optical Character Recognition (OCR) on the image, and the false table lines are removed to obtain clean real table lines. The character row information includes the height of the character row, the width of a single character, the angle of the character row, and the like.
For example, some character strokes are long, or adjacent character strokes join together, and may be detected as a table line in step S20; such false table lines can be filtered out according to the height of the text line and the width of a single character. For another example, if a vertical table line detected in step S20 is shorter than the height of the text line, it is judged to be a false table line. For another example, if the angle of the text line is regarded as horizontal, the perpendicular direction is regarded as vertical; if a table line detected in step S20 falls outside both the allowable angle range of horizontal lines and that of vertical lines, it is judged to be a false table line. The allowable angle range of a horizontal line is, for example, within plus or minus 15 degrees of horizontal; that of a vertical line is, for example, within plus or minus 15 degrees of vertical.
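These filtering rules can be sketched as follows, using the plus-or-minus-15-degree tolerance and the text-line-height test described above; the function name and the exact combination of rules are assumptions for illustration:

```python
import math

def is_real_table_line(p0, p1, text_line_height, angle_tol=15.0):
    """Reject candidate lines per step S30: a line must be roughly
    horizontal or roughly vertical (within angle_tol degrees), and a
    vertical line shorter than one text-line height is treated as a
    long character stroke rather than a table line."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    angle = abs(math.degrees(math.atan2(dy, dx))) % 180
    horizontal = angle <= angle_tol or angle >= 180 - angle_tol
    vertical = abs(angle - 90) <= angle_tol
    if not (horizontal or vertical):
        return False  # oblique: neither a row nor a column separator
    if vertical and math.hypot(dx, dy) < text_line_height:
        return False  # too short: likely a character stroke
    return True
```

A usage example: `is_real_table_line((0, 0), (100, 2), 20)` accepts a nearly horizontal ruling, while a 10-pixel vertical stub next to 20-pixel-tall text is rejected.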
Step S40: and according to the position relation among the table lines, respectively classifying all the table lines into groups of each row and each column. There is an inevitable case where the same table line is detected as a plurality of table lines due to factors such as poor image quality. Meanwhile, the table has the condition that the table lines belonging to the same row and the same column are divided into a plurality of table lines for format requirement. The step is to classify the horizontal lines into different row groups according to the position relation among the horizontal lines in the table lines in order to accurately restore the rows and columns of the cells; and classifying the vertical lines into different column groups according to the position relation among the vertical lines in the table lines.
For example, horizontal and vertical lines are distinguished by computing the angle of each table line. The horizontal lines are sorted by their starting end points and processed in a loop; horizontal lines that are close in vertical distance and overlap horizontally are merged and deduplicated, so that horizontal lines that logically belong to one line but were detected as several are assembled into a single line. This process can be accelerated with a Union-Find algorithm. Finally, the horizontal lines of each table row are gathered into a group, which contains one or more horizontal lines depending on whether cells are merged. Vertical lines are processed in a similar way.
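A minimal union-find sketch of the merge-and-deduplicate pass described above; the `(x0, x1, y)` segment encoding and the 5-pixel vertical tolerance are illustrative choices, not from the patent:

```python
def merge_horizontal_lines(segments, y_tol=5):
    """Group horizontal segments that are vertically close and overlap
    horizontally, using union-find as suggested in step S40, then
    merge each group into one spanning segment.  Segments are
    (x0, x1, y) tuples with x0 <= x1."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            (ax0, ax1, ay), (bx0, bx1, by) = segments[i], segments[j]
            if abs(ay - by) <= y_tol and ax0 <= bx1 and bx0 <= ax1:
                union(i, j)  # same logical ruling

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    out = []
    for segs in groups.values():
        x0 = min(s[0] for s in segs)
        x1 = max(s[1] for s in segs)
        y = sum(s[2] for s in segs) / len(segs)
        out.append((x0, x1, y))
    return sorted(out)

# Two fragments of the same ruling, plus a separate lower ruling.
segs = [(0, 50, 100), (45, 120, 102), (0, 120, 200)]
merged = merge_horizontal_lines(segs)
```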
Step S50: and constructing cells according to the group to which the table lines belong, and storing the optical character recognition result in each cell range as character information in the cell to finally obtain the complete structured spreadsheet. This allows the layout of the spreadsheet to be consistent with the layout of the table in the original image.
Step S60: if the electronic form structuring in the step S50 fails and is caused by a form line detection error, extracting the typical features of the failure scene, generating a difficult sample according to the typical features, retraining the semantic segmentation network, and repeating the steps S10 to S50 by using the retrained semantic segmentation network until the electronic form structuring in the step S50 succeeds.
Referring to fig. 2, the step S60 further includes the following sub-steps.
Step S61: a general sample synthesis tool is prepared which can control the presence, size, position, style, etc. of the graphic elements in the generated sample by parameters. Therefore, when the sample is generated, the sample and the label of the corresponding characteristic can be generated only by adjusting the parameters according to the expected sample characteristic, and the data collection and data label process with higher cost is avoided.
By way of example, the sample synthesis tool abstracts sample generation into five parts: basic background texture, table structure, text content and style, table-line position and style, and stamp/watermark synthesis; various samples can be generated by flexibly configuring the parameters of these parts. The parameters of the table structure part include the number, size, position, row and column counts, merged-cell configuration, and so on of the tables. The parameters of the text content and style part include font size, font style, color, position, alignment, and so on. The parameters of the table-line position and style part include the line type, thickness, pixel area, and so on of the table lines. The parameters of the stamp/watermark synthesis part include the number, position, angle, color, and so on of stamps and watermarks.
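A hedged sketch of what such a parameterized synthesis tool's configuration might look like; every field name below is an assumption, since the patent lists only the categories of adjustable parameters:

```python
from dataclasses import dataclass, field
import random

@dataclass
class SynthesisParams:
    """Illustrative parameter set for the five-part sample synthesis
    tool of step S61 (field names are hypothetical)."""
    background_color: str = "white"
    texture_pattern: str = "none"
    n_rows: int = 5
    n_cols: int = 4
    merged_cells: list = field(default_factory=list)
    font_size: int = 12
    line_style: str = "solid"   # e.g. solid, dashed, double
    line_thickness: int = 1
    n_stamps: int = 0
    stamp_angle: float = 0.0

def sample_failure_case(rng):
    """Bias parameters toward one known failure mode (step S63):
    here, stamp occlusion over thin dashed rulings."""
    return SynthesisParams(
        line_style="dashed",
        line_thickness=1,
        n_stamps=rng.randint(1, 3),
        stamp_angle=rng.uniform(-30, 30),
    )

p = sample_failure_case(random.Random(0))
```

Each failure feature collected in step S62 would map to its own biased sampler like `sample_failure_case`.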
Step S62: typical features in the scenario of a spreadsheet structuring failure due to a form line detection error are collected and analyzed. Typical features of the failure scene include, for example, a character line caused by printing misalignment or handwriting, a false line caused by a long-stroke Chinese character longitudinal repeated arrangement, a missing line caused by stamp shielding, mistakenly identifying a stamp edge as a table line, making it difficult to distinguish the table line from the background due to strong light shooting, separating cells in a complex texture sample by color lines or color blocks, separating adjacent cells by two parallel lines, identifying a missing table line in a short dense cell, and the like.
Step S63: according to the typical characteristics of the failure scenario obtained in step S62, parameters in the generic sample synthesis tool are adjusted to generate difficult samples and labels with the same characteristics. The universal sample synthesis tool generates difficult samples and labels the generated difficult samples. Data annotation refers to the act of processing the learning data of an artificial intelligence algorithm by a data processing personnel marking tool.
As an example, the general sample synthesis tool generates a base image from the parameters of the basic background texture part, a table structure from the parameters of the table structure part, text content and style from the parameters of the text content and style part, and table lines and their style from the parameters of the table-line position and style part, then superimposes stamps and watermarks according to the parameters of the stamp/watermark synthesis part, and finally composes the image, table structure, text, table lines, stamps, and watermarks into one picture, labeled with the table structure, table lines, and other contents.
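The layer-by-layer compositing order described above can be sketched as follows. A real implementation would rasterize each layer; this illustrative stand-in merely records each stage's contribution, which shows why the label comes for free when the sample is synthesized rather than collected:

```python
def synthesize_sample(params):
    """Compose one labeled sample in the layer order of step S63:
    background -> table structure -> text -> rulings -> stamp.
    The params keys are hypothetical names for the patent's
    parameter categories."""
    layers = []
    layers.append(("background", params["texture"]))
    layers.append(("structure", (params["n_rows"], params["n_cols"])))
    layers.append(("text", params["font"]))
    layers.append(("lines", params["line_style"]))
    if params.get("n_stamps", 0):
        layers.append(("stamp", params["n_stamps"]))
    label = {name: value for name, value in layers}
    return layers, label

layers, label = synthesize_sample(
    {"texture": "paper", "n_rows": 3, "n_cols": 2,
     "font": "songti-12", "line_style": "dashed", "n_stamps": 1})
```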
Step S64: retraining the semantic segmentation network for obtaining a set of pixels in the vicinity of a potential form line in the image with the generated difficult samples. The retrained semantic segmentation network is used for repeating the steps S10 to S50, and can bring more accurate segment fitting results, so that the success rate of structuring the whole spreadsheet is improved.
Referring to fig. 3, the apparatus for detecting table lines in an image according to the present application includes a semantic segmentation unit 10, a line fitting unit 20, a table line filtering unit 30, a table line grouping unit 40, an electronic table structuring unit 50, and a retraining unit 60.
The semantic segmentation unit 10 is configured to obtain, with a semantic segmentation network, the set of pixels near potential table lines in an input image, i.e., isolated pixels in regions where table lines may exist.
The line segment fitting unit 20 is configured to fit line segments to the pixel set near the table lines to obtain the table lines, i.e., to connect the isolated pixels predicted by the previous unit into line segments with a traditional line-fitting method.
The form line filtering unit 30 is configured to filter the form lines according to the character line information obtained by performing optical character recognition on the image, and remove the false form lines to obtain real form lines.
The table line grouping unit 40 is configured to group all table lines into groups of rows and columns according to the position relationship between the table lines.
The spreadsheet structuring unit 50 is configured to construct cells according to the groups to which the table lines belong, and store the optical character recognition result in each cell range as the text information in the cell, so as to finally obtain a complete structured spreadsheet.
The retraining unit 60 is configured to, when the spreadsheet structuring unit 50 fails to structure the spreadsheet because of a table-line detection error, extract the typical features of the failure scene, generate difficult samples from those features, and retrain the semantic segmentation network. The retrained network is sent to the semantic segmentation unit 10, and the semantic segmentation unit 10, line segment fitting unit 20, table line filtering unit 30, table line grouping unit 40, and spreadsheet structuring unit 50 are executed again until the spreadsheet structuring unit 50 succeeds.
The method and device for detecting table lines in an image combine a data-driven approach (training and using a semantic segmentation network, then generating difficult samples from failure scenes for retraining) with line segment fitting, and are therefore highly robust.
The above are merely preferred embodiments of the present application and are not intended to limit it. Various modifications and changes may occur to those skilled in the art; any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within its protection scope.

Claims (10)

1. A method for detecting table lines in an image, comprising the following steps:
step S10: inputting the image into a semantic segmentation network to obtain a pixel set of a region adjacent to a potential table line; the potential table line neighborhood pixel set refers to isolated pixel points of some regions where table lines may exist;
step S20: performing line segment fitting on the pixel set in the area adjacent to the table line to obtain the table line;
step S30: filtering the table lines obtained in the step S20 according to character line information obtained by carrying out optical character recognition on the image, and removing false table lines to obtain real table lines;
step S40: according to the positional relationships among the table lines, classifying all table lines into row groups and column groups;
step S50: constructing cells according to the groups to which the table lines belong, and storing the optical character recognition result in the range of each cell as character information in the cell to finally obtain a complete structured spreadsheet;
step S60: if the electronic form structuring in the step S50 fails and is caused by a form line detection error, extracting the typical features of the failure scene, generating a difficult sample according to the typical features, retraining the semantic segmentation network, and repeating the steps S10 to S50 by using the retrained semantic segmentation network until the electronic form structuring in the step S50 succeeds.
2. The method for detecting table lines in an image as claimed in claim 1, wherein in the step S10, the semantic segmentation of the image is to classify each pixel point in the image, determine the category of each point, and thereby perform region division; the semantic segmentation network is based on a deep learning algorithm and comprises one or more of a convolutional neural network, a deep convolutional neural network and a full convolutional network.
3. A method for detecting form lines in an image according to claim 1, wherein in step S30, the text line information includes any one or more of a height of a text line, a width of a single text, and an angle of a text line.
4. The method of claim 1, wherein in step S40, the horizontal lines are processed in a loop after being sorted according to the starting end points, and when the horizontal lines with close vertical distances and overlapping horizontal parts are encountered, the horizontal lines are merged and deduplicated, so that the horizontal lines which logically belong to the same horizontal line but are actually detected as a plurality of horizontal lines are assembled into one horizontal line; finally, the horizontal lines of each table row are grouped into a group, and one or more horizontal lines are contained in the group according to the condition that whether the cells are combined or not; a similar approach is used for the processing of vertical lines.
5. The method of claim 4, wherein in step S40, the process is accelerated by a union-find algorithm.
6. A method for detecting table lines in an image according to claim 1, wherein said step S60 further comprises the following sub-steps:
step S61: preparing a general sample synthesis tool having a plurality of adjustable parameters by which samples and labels of various features can be generated;
step S62: collecting and analyzing typical characteristics under the scene of spreadsheet structuralization failure caused by wrong detection of the spreadsheet lines;
step S63: according to the typical characteristics of the failure scene obtained in the step S62, adjusting parameters in the general sample synthesis tool to generate difficult samples and labels with the same characteristics;
step S64: retraining the semantic segmentation network for obtaining a set of pixels in the vicinity of a potential form line in the image with the generated difficult samples.
7. The method of claim 6, wherein in step S61, the difficult sample synthesis tool abstracts the sample generation process into five parts, namely, basic background texture, table structure, text content and style, table line position and style, and seal watermark synthesis; the parameters of the basic background texture part comprise any one or more of background pictures, background colors, texture patterns and texture colors; the parameters of the table structure part comprise any one or more of the number, the size, the position, the row and column number and the condition of merging cells of the table; the parameters of the text content and the style part comprise any one or more of a font size, a font style, a color, a position and an alignment mode; the parameters of the form line position and style part comprise any one or more of type style, thickness and pixel area of the form line; the parameters of the stamp watermark composition portion include any one or more of the number, position, angle, color of the stamp watermark.
8. The method of claim 6, wherein in step S62 the typical characteristics of the failure scenes include any one or more of: character lines caused by printing misalignment or handwriting; false lines caused by Chinese characters with long strokes being repeated vertically; missing lines caused by stamp occlusion; a stamp edge mistakenly recognized as a table line; table lines hard to distinguish from the background in images shot under strong light; cells separated by colored lines or color blocks in samples with complex textures; neighboring cells separated by two parallel lines; and very short table lines in small, dense cells being missed.
9. The method of claim 7, wherein in step S63 the general sample synthesis tool generates a base image according to the parameters of the basic background texture part, generates a table structure according to the parameters of the table structure part, generates text content and styles according to the parameters of the text content and style part, generates table lines and their styles according to the parameters of the table line position and style part, and superimposes stamp watermarks according to the parameters of the stamp watermark composition part; finally, the base image, table structure, text, table lines and stamp watermarks of the respective parts are combined into a labeled picture.
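A toy illustration of the key property of this composition step: because the label is derived from the same line geometry used to draw the table lines, every synthesized picture comes with a pixel-accurate label for free. Image size, grid positions and grey levels below are arbitrary choices for the sketch.

```python
# Draw a 2x2 table grid into a grey-level "image" and produce the matching
# line mask from the same geometry (1 = table-line pixel).
H, W = 64, 96
image = [[255] * W for _ in range(H)]  # white background layer
label = [[0] * W for _ in range(H)]    # line mask, generated alongside

rows = [8, 32, 56]   # y positions of the horizontal rules
cols = [8, 48, 88]   # x positions of the vertical rules
for y in rows:
    for x in range(cols[0], cols[-1] + 1):
        image[y][x] = 0   # draw the rule
        label[y][x] = 1   # and record it in the label at the same time
for x in cols:
    for y in range(rows[0], rows[-1] + 1):
        image[y][x] = 0
        label[y][x] = 1

line_pixels = sum(v for row in label for v in row)
```

Real synthesis would rasterize textures, text and stamps on top in the order the claim describes, but the image/label pairing shown here is what makes the generated samples directly usable for retraining the segmentation network.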
10. A device for detecting table lines in an image, characterized by comprising a semantic segmentation unit, a line fitting unit, a table line filtering unit, a table line grouping unit, a spreadsheet structuring unit and a retraining unit;
the semantic segmentation unit is used for obtaining, with a semantic segmentation network, the set of pixels near potential table lines in an input image;
the line fitting unit is used for performing line fitting on the set of pixels near the table lines to obtain the table lines;
the table line filtering unit is used for filtering the table lines according to character line information obtained by performing optical character recognition on the image, removing false table lines and keeping the real table lines;
the table line grouping unit is used for grouping all the table lines into row groups and column groups according to the positional relationships among the table lines;
the spreadsheet structuring unit is used for constructing cells according to the groups to which the table lines belong, and storing the optical character recognition result within each cell's extent as the text of that cell, finally obtaining a complete structured spreadsheet;
the retraining unit is used for, when the spreadsheet structuring unit fails because table lines were detected incorrectly, extracting the typical characteristics of the failure scene, generating difficult samples according to those characteristics, and retraining the semantic segmentation network; the retrained semantic segmentation network is then fed back to the semantic segmentation unit, and the semantic segmentation unit, line fitting unit, table line filtering unit, table line grouping unit and spreadsheet structuring unit are executed again until the spreadsheet structuring unit succeeds.
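The cooperation of the six units, including the retrain-and-repeat behavior of the retraining unit, can be sketched as follows. Every unit here is a placeholder stub; the real device's interfaces are not specified by this sketch.

```python
# Hypothetical wiring of the six units of claim 10.

class TableLineDetector:
    def __init__(self, segment, fit, filter_lines, group, structure, retrain):
        self.segment = segment            # semantic segmentation unit
        self.fit = fit                    # line fitting unit
        self.filter_lines = filter_lines  # table line filtering unit
        self.group = group                # table line grouping unit
        self.structure = structure        # spreadsheet structuring unit
        self.retrain = retrain            # retraining unit

    def run(self, image, max_rounds=3):
        for _ in range(max_rounds):
            pixels = self.segment(image)                 # pixels near lines
            lines = self.filter_lines(self.fit(pixels))  # fit, then filter
            sheet = self.structure(self.group(lines))    # group, then build
            if sheet is not None:                        # structuring succeeded
                return sheet
            # Failure: swap in a retrained segmentation network and repeat.
            self.segment = self.retrain(self.segment)
        return None


# Demo stubs: the first segmentation "network" fails, and its retrained
# replacement succeeds on the second round.
def bad_segment(img): return "noisy"
def good_segment(img): return "clean"
identity = lambda x: x
def structure(g): return {"cells": 4} if g == "clean" else None

detector = TableLineDetector(bad_segment, identity, identity, identity,
                             structure, lambda seg: good_segment)
result = detector.run("page.png")
```

The loop terminates either when the structuring unit returns a complete spreadsheet or when a round limit is reached, mirroring the "repeat until structuring succeeds" behavior claimed for the retraining unit.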
CN202111134050.5A 2021-09-27 2021-09-27 Method and device for detecting table line in image Pending CN113723362A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111134050.5A CN113723362A (en) 2021-09-27 2021-09-27 Method and device for detecting table line in image
PCT/CN2022/085400 WO2023045298A1 (en) 2021-09-27 2022-04-06 Method and apparatus for detecting table lines in image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111134050.5A CN113723362A (en) 2021-09-27 2021-09-27 Method and device for detecting table line in image

Publications (1)

Publication Number Publication Date
CN113723362A 2021-11-30

Family

ID=78685034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111134050.5A Pending CN113723362A (en) 2021-09-27 2021-09-27 Method and device for detecting table line in image

Country Status (2)

Country Link
CN (1) CN113723362A (en)
WO (1) WO2023045298A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114255346A (en) * 2021-12-29 2022-03-29 科大讯飞股份有限公司 Form image processing method, related device and readable storage medium
CN114782968A (en) * 2022-03-31 2022-07-22 上海云从企业发展有限公司 Form identification method, device and electronic equipment
WO2023045298A1 (en) * 2021-09-27 2023-03-30 上海合合信息科技股份有限公司 Method and apparatus for detecting table lines in image

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN116912863A (en) * 2022-12-07 2023-10-20 中国移动通信有限公司研究院 Form identification method and device and related equipment
CN116386071A (en) * 2023-04-18 2023-07-04 湖南星汉数智科技有限公司 Image table structure recognition method, device, computer equipment and storage medium
CN116311310A (en) * 2023-05-19 2023-06-23 之江实验室 Universal form identification method and device combining semantic segmentation and sequence prediction
CN117475459B (en) * 2023-12-28 2024-04-09 杭州恒生聚源信息技术有限公司 Table information processing method and device, electronic equipment and storage medium

Citations (9)

Publication number Priority date Publication date Assignee Title
CN101676930A (en) * 2008-09-17 2010-03-24 北大方正集团有限公司 Method and device for recognizing table cells in scanned image
CN107943956A (en) * 2017-11-24 2018-04-20 北京金堤科技有限公司 Conversion of page method, apparatus and conversion of page equipment
CN110163198A (en) * 2018-09-27 2019-08-23 腾讯科技(深圳)有限公司 A kind of Table recognition method for reconstructing, device and storage medium
US20190294661A1 (en) * 2018-03-21 2019-09-26 Adobe Inc. Performing semantic segmentation of form images using deep learning
CN110569489A (en) * 2018-06-05 2019-12-13 北京国双科技有限公司 Form data analysis method and device based on PDF file
CN110796031A (en) * 2019-10-11 2020-02-14 腾讯科技(深圳)有限公司 Table identification method and device based on artificial intelligence and electronic equipment
CN112396047A (en) * 2020-10-30 2021-02-23 北京文思海辉金信软件有限公司 Training sample generation method and device, computer equipment and storage medium
CN112528863A (en) * 2020-12-14 2021-03-19 中国平安人寿保险股份有限公司 Identification method and device of table structure, electronic equipment and storage medium
CN113221743A (en) * 2021-05-12 2021-08-06 北京百度网讯科技有限公司 Table analysis method and device, electronic equipment and storage medium

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US11200413B2 (en) * 2018-07-31 2021-12-14 International Business Machines Corporation Table recognition in portable document format documents
CN110363095B (en) * 2019-06-20 2023-07-04 华南农业大学 Identification method for form fonts
CN111860502B (en) * 2020-07-15 2024-07-16 北京思图场景数据科技服务有限公司 Picture form identification method and device, electronic equipment and storage medium
CN112507876B (en) * 2020-12-07 2024-10-15 数地工场(南京)科技有限公司 Wired form picture analysis method and device based on semantic segmentation
CN113723362A (en) * 2021-09-27 2021-11-30 上海合合信息科技股份有限公司 Method and device for detecting table line in image

Non-Patent Citations (1)

Title
唐皓瑾 (Tang Haojin): "Research and Implementation of a Table Data Extraction Method for PDF Files", China Master's Theses Full-text Database, Information Science and Technology, no. 08, pages 138 *

Also Published As

Publication number Publication date
WO2023045298A1 (en) 2023-03-30

Similar Documents

Publication Publication Date Title
CN113723362A (en) Method and device for detecting table line in image
CN102332096B (en) Video caption text extraction and identification method
CA2113751C (en) Method for image segmentation and classification of image elements for document processing
Alberti et al. Labeling, cutting, grouping: an efficient text line segmentation method for medieval manuscripts
Dongre et al. Devnagari document segmentation using histogram approach
CN110363095A (en) A kind of recognition methods for table font
CN101777124A (en) Method for extracting video text message and device thereof
CN105260751B (en) A kind of character recognition method and its system
CN113688795A (en) Method and device for converting table in image into electronic table
CN108052955B (en) High-precision Braille identification method and system
WO2009070032A1 (en) A method for processing optical character recognition (ocr) data, wherein the output comprises visually impaired character images
Harit et al. Table detection in document images using header and trailer patterns
CN101162506A (en) Seal imprint image search method of circular stamp
Kaundilya et al. Automated text extraction from images using OCR system
Bijalwan et al. Automatic text recognition in natural scene and its translation into user defined language
CN116824608A (en) Answer sheet layout analysis method based on target detection technology
KR101937398B1 (en) System and method for extracting character in image data of old document
RU2436156C1 (en) Method of resolving conflicting output data from optical character recognition system (ocr), where output data include more than one character image recognition alternative
CN115661183A (en) Intelligent scanning management system and method based on edge calculation
CN106503706A (en) The method of discrimination of Chinese character pattern cutting result correctness
Chen et al. Model-based tabular structure detection and recognition in noisy handwritten documents
JP4492258B2 (en) Character and figure recognition and inspection methods
JP4867894B2 (en) Image recognition apparatus, image recognition method, and program
Barna et al. Segmentation of heterogeneous documents into homogeneous components using morphological operations
CN107886808B (en) Braille square auxiliary labeling method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination