CN113807071A

CN113807071A - OCR-based document generation method

Info

Publication number: CN113807071A
Application number: CN202111011260.5A
Authority: CN
Inventors: 李梦茹; 金红达; 何琦枫; 孙建彬; 谢建勋; 姜雪明
Original assignee: Zhejiang Supcon Information Technology Co., Ltd.
Current assignee: Zhejiang Supcon Information Technology Co., Ltd.
Priority date: 2021-08-31
Filing date: 2021-08-31
Publication date: 2021-12-17

Abstract

The invention discloses an OCR-based document generation method comprising the following steps: importing a template, converting the template content into HTML elements, and displaying them on a preview page; operating on an input page, where a title grading component determines the title style, a title component receives the title content, a document component receives the document content, a file recognition component recognizes a file dragged in with the mouse, a file upload component selects a file to upload, a text recognition component converts the file held in the file recognition component into text content, and clicking a confirm component displays the input page content on the preview page; and exporting the preview page content as a text file through an export component. The method removes redundant steps from ordinary document writing by eliminating page switching, copying and pasting, and page typesetting. It also supports referencing pictures, voice, and video, and recognizing text from pictures, voice, and video, which makes operation more convenient and improves document writing efficiency.

Description

OCR-based document generation method

Technical Field

The invention relates to the field of application software, and in particular to an OCR-based document generation method.

Background

A conventional operation manual includes an introduction, a software overview, instructions for using the software, a list of operation commands, user operation examples, and the like. Such a manual is usually written by the developers concerned, who must follow a given specification, usually work from a fixed template, and must also provide illustrated descriptions. The writer therefore has to capture screenshots frequently while operating the system and writing the document at the same time, which is cumbersome.

Document generation methods in the prior art have several problems: frequent page switching, frequent copying and pasting, complex page typesetting, no support for referencing pictures, voice, or video, and no support for recognizing text from pictures, voice, or video.

For example, the Chinese patent document "document creation method and a related apparatus" (publication number CN110008461A, filing date April 16, 2019) addresses the long turnaround and low efficiency that arise in the prior art when developers and business staff must communicate before a document template can be built and modified through programming. However, it still suffers from complex operation, troublesome typesetting, no support for referencing pictures, voice, or video, and no support for recognizing text from pictures, voice, or video.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides an OCR-based document generation method that avoids frequent page switching, frequent copying and pasting, and complex page typesetting, and that adds the missing support for referencing pictures, voice, and video and for recognizing text from pictures, voice, and video.

The technical scheme of the invention is as follows.

An OCR-based document generation method comprising the steps of:

S1: importing a template, converting the template content into HTML elements, and displaying the HTML elements on a preview page;

S2: operating on an input page, wherein a title grading component determines the title style, a title component receives the title content, a document component receives the document content, a file recognition component recognizes a file dragged in with the mouse and takes it as the uploaded file, a file upload component selects a file and uploads it to the file recognition component, a text recognition component converts the file held in the file recognition component into text content and displays the text content on the preview page, and clicking a confirm component displays the input page content on the preview page;

S3: an export component exports the preview page content as a text file.

The method removes redundant steps from ordinary document writing by eliminating page switching, copying and pasting, and page typesetting. It also supports referencing pictures, voice, and video, and recognizing text from pictures, voice, and video, which makes operation more convenient and improves document writing efficiency.
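Step S1 can be illustrated with a minimal Python sketch. It assumes the template has already been parsed into (style, text) pairs; the style names, the `STYLE_TO_TAG` mapping, and the function `template_to_html` are hypothetical and are not taken from the patent, which does not describe how a Word template is parsed.

```python
from html import escape

# Hypothetical mapping from template paragraph styles to HTML tags.
STYLE_TO_TAG = {"title1": "h1", "title2": "h2", "title3": "h3", "body": "p"}


def template_to_html(paragraphs):
    """Convert (style, text) pairs into a list of HTML element strings."""
    elements = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")  # unknown styles fall back to <p>
        # Escape the text so template content cannot break the markup.
        elements.append(f"<{tag}>{escape(text)}</{tag}>")
    return elements


preview = template_to_html([("title1", "1 Introduction"), ("body", "Scope & purpose")])
print("\n".join(preview))
```

Escaping the text before wrapping it in a tag keeps characters such as `&` or `<` in the template from being interpreted as markup on the preview page.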

Preferably, the file upload component uploads a picture file, a voice file, or a video file. A single component supports uploading different file types, which is convenient for the user.

Preferably, the file recognition component directly recognizes a picture file, a voice file, or a video file dragged into its region. On a desktop computer, files can be uploaded by dragging, which is convenient for the user.

Preferably, the text recognition component recognizes the file and processes it into text content, and the text content is saved to the preview page. Once the file has been recognized as text, the text is displayed on the preview page, so that both the recognized content and the authored content can be revised afterwards, giving the user a complete document service.

Preferably, each time the input is confirmed, the preview page saves the component content as an independent tag. This makes it easy to order the recognized text correctly, gives a better segment-by-segment experience, and facilitates later revision.

Preferably, the independent tag is split into two independent sub-tags when it is double-clicked. This allows a specific piece of content to be edited precisely and makes copying from the preview more convenient.

Preferably, the independent tags are sorted in ascending order. Sorting the typed and recognized content avoids the two becoming interleaved in the wrong order and improves the user experience.

Preferably, the document generation method is implemented as a desktop application whose front-end pages are built on the Vue framework, whose back end is written in Java, and which is packaged with Electron. An application built this way has a small storage footprint, strong interactivity, cross-platform support, and high security.

Preferably, the template is a built-in template of the document generation method or a Word file containing content. The built-in templates meet the typesetting requirements of some documents, and user-uploaded custom templates are also supported, which meets user needs and improves the user experience.

The beneficial effects of the invention are as follows: the method removes redundant steps from ordinary document writing by eliminating page switching, copying and pasting, and page typesetting. It also supports referencing pictures, voice, and video, and recognizing text from pictures, voice, and video, which makes operation more convenient and improves document writing efficiency.

Drawings

FIG. 1 is a flowchart of a document generation method based on OCR.

Detailed Description

The technical solution of the invention is described in more detail below with reference to the embodiment and the accompanying drawing. Numerous specific details are set forth to provide a better understanding of the invention. Those skilled in the art will understand that the invention may be practiced without some of these details. In some instances, methods and means well known to those skilled in the art are not described in detail, so as not to obscure the invention.

Embodiment: an OCR-based document generation method comprising the steps of:

S1: importing a template, converting the template content into HTML elements, and displaying the HTML elements on a preview page;

S2: operating on an input page, wherein a title grading component determines the title style, a title component receives the title content, a document component receives the document content, a file recognition component recognizes a file dragged in with the mouse and takes it as the uploaded file, a file upload component selects a file and uploads it to the file recognition component, a text recognition component converts the file held in the file recognition component into text content and displays the text content on the preview page, and clicking a confirm component displays the input page content on the preview page;

S3: an export component exports the preview page content as a text file.

In this embodiment, the file upload component uploads a picture file, a voice file, or a video file. When the file upload component is triggered, a file selection dialog opens; the user selects the target file by its path and clicks OK in the dialog, and the file is uploaded to the file recognition component.

In this embodiment, the file recognition component directly recognizes a picture file, a voice file, or a video file dragged into its region. The user presses the mouse button on the target file, drags it onto the file recognition component, and releases it; the file is then uploaded successfully.

In this embodiment, the text recognition component recognizes the file and processes it into text content, and the text content is saved to the preview page.

After the text recognition component is triggered, the front end checks the file. If the file is empty, the input page pops up a prompt stating that the uploaded file is empty. Otherwise the front end determines the file type. If the file is a picture, voice, or video file, the file information is sent to the back end for processing together with the current maximum independent tag serial number. If the file is none of these types, the input page pops up a prompt stating that the file is not suitable for upload, and the user selects another file to upload.
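The two front-end checks described above (empty file, then file type) might look like the following sketch. It is written in Python for brevity, although the embodiment's front end is Vue-based. The extension lists in `KINDS`, the function `check_upload`, and the message strings are assumptions; the patent names only the three file categories.

```python
from pathlib import Path

# Hypothetical extension lists; the patent names only the three categories.
KINDS = {
    "picture": {".png", ".jpg", ".jpeg", ".bmp"},
    "voice": {".wav", ".mp3"},
    "video": {".mp4", ".avi"},
}


def check_upload(filename, size_bytes):
    """Return (ok, kind_or_message) following the two checks in the text."""
    # Check 1: reject empty files before looking at the type.
    if size_bytes == 0:
        return False, "The uploaded file is empty."
    # Check 2: accept only picture, voice, or video files.
    suffix = Path(filename).suffix.lower()
    for kind, extensions in KINDS.items():
        if suffix in extensions:
            return True, kind
    return False, "Unsupported file type; please select another file."


print(check_upload("scan.PNG", 2048))
print(check_upload("notes.txt", 12))
```

When the check succeeds, the returned kind tells the caller which back-end conversion to request; the maximum independent tag serial number would be sent along with the file.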

When the back end receives a picture file, the program calls the picture-to-text method to parse it, and the result is displayed on the preview page. The process is as follows: the back end obtains the picture file, calls the picture-to-text interface, ignores the text styling, and parses the picture content. The parsed text is held temporarily as a character array in which each element consists of an independent tag serial number and one segment of text. The first serial number is the transmitted maximum independent tag serial number plus one, and each subsequent serial number increases by one. Once the picture text has been parsed, the character array is sent to the front end, which processes the elements in ascending order of serial number, renders each one as an independent tag, and displays its text on the preview page.
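The serial numbering described above can be sketched as follows. The picture-to-text call itself is not shown, because the patent does not name a recognition library; the sketch starts from a list of already recognized text segments. The dictionary layout and the function names `number_segments` and `render_in_order` are assumptions made for illustration.

```python
def number_segments(segments, max_serial):
    """Pair each recognized segment with an independent-tag serial number.

    The first serial is max_serial + 1; each later serial increases by one.
    """
    return [
        {"serial": max_serial + offset, "text": text}
        for offset, text in enumerate(segments, start=1)
    ]


def render_in_order(elements):
    """Front-end side: return the texts in ascending serial order."""
    return [e["text"] for e in sorted(elements, key=lambda e: e["serial"])]


# The preview page already holds tags 1..3, so new tags start at 4.
array = number_segments(["Open the Settings page.", "Click Save."], max_serial=3)
print(array)
print(render_in_order(array))
```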

When the back end receives a voice file, the program calls the voice-to-text method to parse it, and the result is displayed on the preview page. The process is as follows: the back end obtains the voice file, calls the voice-to-text interface, and parses the voice content. A pause of more than one second between consecutive stretches of speech marks the boundary between two segments of voice content. The parsed text is held temporarily as a character array in which each element consists of an independent tag serial number and one segment of text. The first serial number is the transmitted maximum independent tag serial number plus one, and each subsequent serial number increases by one. Once the voice text has been parsed, the character array is sent to the front end, which processes the elements in ascending order of serial number, renders each one as an independent tag, and displays its text on the preview page.
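The one-second pause rule can be sketched as below. The sketch assumes that the voice-to-text interface returns words with start and end times in seconds; that output format, the function name `split_on_pauses`, and the `max_gap` parameter are assumptions, since the patent does not describe the interface.

```python
def split_on_pauses(words, max_gap=1.0):
    """Group (start, end, text) words into segments.

    A new segment starts when the silence between one word's end and the
    next word's start is longer than max_gap seconds.
    """
    segments = []
    current = []
    previous_end = None
    for start, end, text in words:
        if previous_end is not None and start - previous_end > max_gap:
            segments.append(" ".join(current))
            current = []
        current.append(text)
        previous_end = end
    if current:
        segments.append(" ".join(current))
    return segments


words = [
    (0.0, 0.5, "open"), (0.6, 1.0, "the"), (1.1, 1.5, "menu"),
    (3.0, 3.4, "click"), (3.5, 3.9, "save"),
]
print(split_on_pauses(words))
```

Because the text says "more than one second", a pause of exactly one second does not start a new segment in this sketch.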

When the back end receives a video file, the program calls the video-to-text method to parse it, and the result is displayed on the preview page. The process is as follows: the back end obtains the video file, extracts the audio from the video, calls the voice-to-text interface, and parses the extracted audio. A pause of more than one second between consecutive stretches of audio marks the boundary between two segments of audio content. The parsed text is held temporarily as a character array in which each element consists of an independent tag serial number and one segment of text. The first serial number is the transmitted maximum independent tag serial number plus one, and each subsequent serial number increases by one. Once the video text has been parsed, the character array is sent to the front end, which processes the elements in ascending order of serial number, renders each one as an independent tag, and displays its text on the preview page.
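The patent does not say which tool performs the audio extraction. One common choice is the FFmpeg command-line tool, and the sketch below only builds such a command without running it. The use of FFmpeg, the 16 kHz mono PCM output, and the function name `audio_extraction_command` are all assumptions.

```python
def audio_extraction_command(video_path, audio_path):
    """Build an FFmpeg command that extracts the audio track of a video.

    -vn drops the video stream, -acodec pcm_s16le writes 16-bit PCM,
    -ar 16000 resamples to 16 kHz, and -ac 1 mixes down to mono.
    """
    return [
        "ffmpeg", "-i", video_path,
        "-vn", "-acodec", "pcm_s16le",
        "-ar", "16000", "-ac", "1",
        audio_path,
    ]


command = audio_extraction_command("demo.mp4", "demo.wav")
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a machine where FFmpeg is installed; the extracted audio would then go through the same voice-to-text and pause-based segmentation as a voice file.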

In this embodiment, each time the input is confirmed, the preview page saves the component content as an independent tag. The independent tags are stored as a character array, and each element of the array contains an independent tag serial number and the text content.

In this embodiment, an independent tag is split into two independent sub-tags when it is double-clicked. To split a given independent tag, a split node is determined and the tag is divided into two sub-tags at that node. In the character array, the element corresponding to the tag is likewise split into two sub-elements. The first sub-element keeps the serial number of the original element and takes the text before the split node. The second sub-element takes the original serial number plus one and the text after the split node. The serial numbers of all elements that follow the split element in the character array are each increased by one, and the sub-elements are inserted into the array in ascending order to form a new character array.
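The split and renumbering rule can be sketched as follows. The split node is modelled as a character offset into the tag's text, and the dictionary layout and the function name `split_tag` are assumptions made for illustration.

```python
def split_tag(elements, serial, node):
    """Split the element with the given serial at character offset `node`.

    Returns a new array: the first sub-element keeps `serial`, the second
    gets `serial + 1`, and every later element is shifted up by one.
    """
    result = []
    for element in sorted(elements, key=lambda e: e["serial"]):
        if element["serial"] < serial:
            result.append(dict(element))
        elif element["serial"] == serial:
            text = element["text"]
            result.append({"serial": serial, "text": text[:node]})
            result.append({"serial": serial + 1, "text": text[node:]})
        else:
            # Later elements move up by one to make room for the second sub-tag.
            result.append({"serial": element["serial"] + 1, "text": element["text"]})
    return result


tags = [
    {"serial": 1, "text": "Intro"},
    {"serial": 2, "text": "HelloWorld"},
    {"serial": 3, "text": "Done"},
]
print(split_tag(tags, serial=2, node=5))
```

Building a new array instead of editing in place keeps the original array intact, so the serial numbers remain unique and ascending after the split.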

In this embodiment, the independent tags are sorted in ascending order. Each independent tag corresponds to an element of the character array, and the elements are sorted by ascending serial number.

In this embodiment, the document generation method is implemented as a desktop application whose front-end pages are built on the Vue framework, whose back end is written in Java, and which is packaged with Electron.

In this embodiment, the template is a built-in template of the document generation method or a Word file containing content.

Specific embodiments are as follows.

Open the program, select a built-in template or upload a template file, and proceed to the input page. Word and PDF template files are supported.

The operations that can be performed in sequence on the input page are: select a level from the drop-down list of the title grading component to determine the title style; enter the title content in the title component; enter the document content in the document component; let the file recognition component recognize a file dragged onto it, or upload a file to the file recognition component through the file upload component; convert the file held in the file recognition component into text content through the text recognition component; and click the confirm component, whereupon the input page content is saved as independent tags and displayed on the preview page.

If files are uploaded to the file recognition component more than once, the component keeps only the most recently uploaded file.

When typed document content and recognized text are submitted together, they are saved as separate independent tags, and the serial number of the tag for the typed document content precedes that of the tag for the recognized text.
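The ordering rule for a combined submission can be sketched as below. The function name `confirm_input`, the dictionary layout, and the choice to start numbering at 1 on an empty preview page are assumptions; the patent states only that the typed content's serial number precedes that of the recognized text.

```python
def confirm_input(elements, typed_text, recognized_segments):
    """Append new independent tags: typed content first, recognized text after."""
    next_serial = max((e["serial"] for e in elements), default=0) + 1
    new_elements = [dict(e) for e in elements]
    pending = ([typed_text] if typed_text else []) + list(recognized_segments)
    for text in pending:
        new_elements.append({"serial": next_serial, "text": text})
        next_serial += 1
    return new_elements


page = confirm_input(
    [{"serial": 1, "text": "1 Overview"}],
    typed_text="The login page is shown below.",
    recognized_segments=["User name", "Password"],
)
print(page)
```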

To modify the preview page content, select an independent tag and operate on it. Clicking an independent tag writes its preview page content back into the input page components; the content is modified there and confirmed, and the modified content is displayed on the preview page. Double-clicking an independent tag splits it at the double-clicked node into two independent sub-tags; clicking a sub-tag writes its content back into the input page components, where it is modified and confirmed, and the modified content is displayed on the preview page.

When the file recognition component holds a valid file but the text recognition component has not been run, confirming causes the preview page to store the absolute path of the file held in the file recognition component.

Selecting an independent tag and clicking the delete component removes the corresponding independent tag from the preview page.

Clicking the export component exports the preview page content to a file. Export to Word and PDF files is supported.

Clicking the delete-all component clears the preview page content.
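As a simplified illustration of the export step, the sketch below writes the independent tags to a plain UTF-8 text file in ascending serial order. Producing Word or PDF output would need a document library that the patent does not name, so that part is not shown; the function name `export_preview` and the blank-line separator are assumptions.

```python
from pathlib import Path
import tempfile


def export_preview(elements, target):
    """Write the tag texts to `target`, ordered by ascending serial number."""
    ordered = sorted(elements, key=lambda e: e["serial"])
    path = Path(target)
    # One blank line between tags keeps each tag as its own paragraph.
    path.write_text("\n\n".join(e["text"] for e in ordered), encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as folder:
    exported = export_preview(
        [{"serial": 2, "text": "Step one."}, {"serial": 1, "text": "1 Overview"}],
        Path(folder) / "manual.txt",
    )
    exported_text = exported.read_text(encoding="utf-8")

print(exported_text)
```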

The above describes only specific embodiments of the present application, but the scope of protection of the present application is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be determined by the scope of protection of the claims.

Claims

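1. An OCR-based document generation method comprising the steps of:

S1: importing a template, converting the template content into HTML elements, and displaying the HTML elements on a preview page;

S2: operating on an input page, wherein a title grading component determines the title style, a title component receives the title content, a document component receives the document content, a file recognition component recognizes a file dragged in with the mouse and takes it as the uploaded file, a file upload component selects a file and uploads it to the file recognition component, a text recognition component converts the file held in the file recognition component into text content and displays the text content on the preview page, and clicking a confirm component displays the input page content on the preview page;

S3: an export component exports the preview page content as a text file.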

2. The OCR-based document generation method according to claim 1, wherein the file upload component uploads a picture file, a voice file, or a video file.

3. The OCR-based document generation method according to claim 1, wherein the file recognition component directly recognizes a picture file, a voice file, or a video file dragged into its region.

4. The OCR-based document generation method according to claim 1, wherein the text recognition component recognizes the file and processes it into text content, and the text content is saved to the preview page.

5. The OCR-based document generation method according to claim 1, wherein, each time the input is confirmed, the preview page saves the component content as an independent tag.

6. The OCR-based document generation method according to claim 1 or 5, wherein the independent tag is split into two independent sub-tags when it is double-clicked.

7. The OCR-based document generation method according to claim 1 or 5, wherein the independent tags are sorted in ascending order.

8. The OCR-based document generation method according to claim 1, wherein the document generation method is implemented as a desktop application whose front-end pages are built on the Vue framework, whose back end is written in Java, and which is packaged with Electron.

9. The OCR-based document generation method according to claim 1, wherein the template is a built-in template of the document generation method or a Word file containing content.