
CN110969161B - Image processing method, circuit, vision-impaired assisting device, electronic device, and medium - Google Patents


Info

Publication number
CN110969161B
CN110969161B (application CN201911214755.0A)
Authority
CN
China
Prior art keywords
text
line
stored
data
image processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911214755.0A
Other languages
Chinese (zh)
Other versions
CN110969161A (en)
Inventor
封宣阳
蔡海蛟
冯歆鹏
周骥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NextVPU Shanghai Co Ltd
Original Assignee
NextVPU Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NextVPU Shanghai Co Ltd filed Critical NextVPU Shanghai Co Ltd
Priority to CN201911214755.0A priority Critical patent/CN110969161B/en
Publication of CN110969161A publication Critical patent/CN110969161A/en
Application granted granted Critical
Publication of CN110969161B publication Critical patent/CN110969161B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

Provided are an image processing method, a circuit, a vision-impaired assisting device, an electronic device, and a medium. The image processing method comprises the following steps: acquiring an image, wherein the image comprises a text region; performing character recognition on a text line to be recognized in the text region to obtain text data of the text line; and storing the text data of the text line into the text to be read.

Description

Image processing method, circuit, vision-impaired assisting device, electronic device, and medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image processing method, a circuit, a vision-impaired assisting device, an electronic device, and a medium.
Background
In recent years, image processing technology has been widely used in various fields, and among them, technology concerning recognition, storage, and reading of text data has been one of the focuses of attention in the industry.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
According to an aspect of the present disclosure, there is provided an image processing method comprising: acquiring an image, wherein the image comprises a text region; performing character recognition on a text line to be recognized in the text region to obtain text data of the text line; and storing the text data of the text line into the text to be read.
According to another aspect of the present disclosure, there is provided an electronic circuit comprising: circuitry configured to perform the steps of the method described above.
According to another aspect of the present disclosure, there is also provided a vision-impaired assisting device, comprising: a camera configured to acquire an image; the electronic circuit described above; circuitry configured to perform text detection and recognition on text contained in the image to obtain text data; and circuitry configured to read the text data.
According to another aspect of the present disclosure, there is also provided an electronic apparatus including: a processor; and a memory storing a program comprising instructions that when executed by the processor cause the processor to perform the method described above.
According to another aspect of the present disclosure, there is also provided a non-transitory computer readable storage medium storing a program comprising instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the above-described method.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment of the present disclosure;
FIG. 2 illustrates an exemplary image including a text region having a plurality of text lines therein;
fig. 3 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
fig. 4 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
FIG. 5 illustrates a process of identifying and storing a line of text to be identified according to an exemplary embodiment of the present disclosure;
FIG. 6 illustrates a process of identifying and storing a line of text to be identified according to another exemplary embodiment of the present disclosure;
fig. 7 is a flowchart illustrating an image processing method according to another exemplary embodiment of the present disclosure;
Fig. 8 is a flowchart illustrating an image processing method according to still another exemplary embodiment of the present disclosure;
fig. 9 (a), 9 (b), 9 (c), 9 (d) are diagrams illustrating a data storage process according to an exemplary embodiment of the present disclosure;
FIG. 10 is a schematic process showing reading of identified text data according to an exemplary embodiment of the present disclosure; and
fig. 11 is a block diagram illustrating an example of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
Although character recognition-related image processing techniques have been widely used in various fields, some challenges still remain in text data reading.
Conventional reading methods may proceed as follows. The first approach stores the recognized text data word by word in a data structure such as an array. Its disadvantage is that retrieving the desired text from such a structure is cumbersome, and the text read out lacks semantic links and context. The second approach first recognizes all the words in the image, stores them as a whole, and only then reads the stored words. Its disadvantage is the long waiting time before reading can begin.
The present disclosure provides an image processing method. Fig. 1 is a flowchart illustrating an image processing method according to an exemplary embodiment of the present disclosure.
In this disclosure, a text line refers to a sequence of words in which the spacing between adjacent words is less than a threshold spacing, i.e., a continuous line of words. The adjacent-word spacing refers to the distance between coordinates of corresponding positions of adjacent words, such as the distance between their upper-left corner coordinates, lower-right corner coordinates, or centroid coordinates. If the spacing between adjacent words is not greater than the threshold spacing, the words may be considered continuous and divided into the same text line. If the spacing is greater than the threshold spacing, the words may be considered discontinuous (e.g., they may belong to different paragraphs, or to left and right columns, respectively) and are therefore divided into different text lines. The threshold spacing may be set according to the text size: for example, adjacent words in a font larger than size No. 4 (such as size No. 3 or No. 2) may use a larger threshold spacing than adjacent words in a font of size No. 4 or smaller (such as size No. 4 or No. 5).
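The grouping rule above can be sketched in Python. This is an illustrative model, not the patent's implementation: words are represented only by a reference coordinate (e.g., the upper-left corner), the Euclidean distance is one of the distance choices mentioned above, and the function name is an assumption.

```python
import math

def group_into_lines(words, threshold):
    """Split a sequence of word positions into continuous text lines.

    Each word is an (x, y) reference coordinate; a gap between adjacent
    words larger than `threshold` starts a new text line.
    """
    lines = []
    current = []
    prev = None
    for word in words:
        if prev is not None and math.dist(word, prev) > threshold:
            lines.append(current)   # gap too large: close the current line
            current = []
        current.append(word)
        prev = word
    if current:
        lines.append(current)
    return lines
```

For example, with a threshold of 15, three closely spaced words followed by two distant ones are split into two text lines.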
As shown in Fig. 2, the example image includes a text region containing 5 text lines. Note that an image is not limited to a single text region but may contain several, and each text region in the image may be processed using the image processing method of the present disclosure.
As shown in fig. 1, an image processing method according to an exemplary embodiment of the present disclosure includes: step S101, acquiring an image, wherein the image comprises a text area; step S102, character recognition is carried out on a text line to be recognized in the text area so as to obtain text data of the text line; and step S103, storing the text data of the text line to the text to be read.
In step S101, an image including a text region is acquired.
The acquired image may include a text region (i.e., a region containing text) with 5 lines of text, as shown in fig. 2. As described above, the image acquired in step S101 may also include a plurality of text regions, and each text region may include a plurality of text lines therein.
According to some embodiments, the acquired image may include text regions, and each text region may include at least two text lines, which may be text in various forms (including various characters, numbers, etc.). In addition to the text regions, the image may also include pictures and the like.
According to some embodiments, the acquired image may be an image taken directly by the camera, or an image obtained by pre-processing such a camera image; the pre-processing may include, for example, denoising, contrast enhancement, resolution adjustment, grayscale conversion, deblurring, and so forth.
According to some embodiments, the images captured by the camera may be acquired in real time or at a time after the images are captured by the camera.
According to some embodiments, the acquired image may be a pre-screened image, for example, the sharpest image selected from several shots.
According to some embodiments, a camera for capturing images may be capable of still or dynamic image capture, which may be a stand-alone device (e.g., camera, video camera, etc.), or may be included in various types of electronic equipment (e.g., mobile phone, computer, personal digital assistant, vision-impaired auxiliary device, tablet computer, reading auxiliary device, wearable device, etc.).
According to some embodiments, the camera may be provided on a wearable device of the user or on a device such as glasses, for example.
In step S102, character recognition is performed for one text line to be recognized in the text region to obtain text data of the text line.
According to some embodiments, text data of a text line may be obtained by performing character recognition on the text line using, for example, an optical character recognition (OCR) method.
According to some embodiments, text line detection is performed after the image is acquired and before character recognition.
According to some embodiments, each text line to be identified in a text region may be sequentially detected and identified, resulting in text data for the text line.
Taking the image shown in Fig. 2 as an example, character recognition may first be performed on the line-1 text to obtain its text data (the line reading “onset”, i.e. “turn on vision”.). Character recognition is then performed on the subsequent lines in order to obtain the corresponding text data.
Note, however, that the detection and recognition need not start from the first line of the text region, but may start directly from other lines.
In step S103, the text data of the text line is stored to the text to be read.
According to some embodiments, each time a text line is recognized, its text data may be stored to the text to be read for reading by the reading device.
According to some embodiments, the text to be read may be actively acquired by the reading device, or may be provided to the reading device by the recognition device for performing text line recognition.
According to some embodiments, after being read, the text to be read becomes read text, and the read text may be further stored if necessary, for example, stored in order for later use. According to some embodiments, the read text may also be discarded if storage is not needed.
Thus, the present disclosure recognizes the text lines in a text region line by line and, after a text line is recognized, stores its text data to the text to be read for reading by the reading device, thereby realizing asynchronous processing of recognition and reading and reducing the waiting time before reading.
Moreover, through the above steps, the text data of the detected and recognized text lines is spliced and stored as the text to be read, so that the content read from it can have normal semantic links and context. This overcomes the choppiness and lack of semantic links and context caused by word-by-word reading, and also largely overcomes the sentence-breaking problem of line-by-line reading (a stiff, mechanical pause between reading one line and the next). For example, according to an exemplary embodiment of the present disclosure, after the previous reading finishes, the text data (at least one line) that has been stored but not yet read is acquired and read continuously line after line, with coherent, connected semantics. In contrast, prior-art line-by-line reading can read only one line at a time, with a noticeable pause and a loss of semantic continuity between finishing one line and starting the next.
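The asynchronous recognize-and-read flow described above can be sketched as a small producer-consumer pipeline. This is an illustrative model only: `recognize` is a stand-in for real OCR, and the queue plays the role of the text to be read.

```python
import queue
import threading

def recognize_lines(raw_lines, out_q, recognize):
    """Producer: recognize each text line and store it immediately,
    so the reader never has to wait for the whole region."""
    for raw in raw_lines:
        out_q.put(recognize(raw))
    out_q.put(None)  # sentinel: recognition finished

def read_lines(in_q, spoken):
    """Consumer: read stored lines in order while recognition continues."""
    while True:
        line = in_q.get()
        if line is None:
            break
        spoken.append(line)

to_read = queue.Queue()
spoken = []
reader = threading.Thread(target=read_lines, args=(to_read, spoken))
reader.start()
recognize_lines(["line 1", "line 2"], to_read, str.upper)
reader.join()
```

Because recognition and reading run concurrently, reading of line 1 can begin while later lines are still being recognized.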
The image processing method according to the exemplary embodiment of the present disclosure may further include step S104, step S105, and step S106, as shown in fig. 3.
In step S104, the text data of the text line is stored to the stored text as one line of data in the stored text.
According to some embodiments, in addition to being stored into the text to be read, the text data of the recognized text line may also be stored into the stored text as one line of data therein. That is, the recognized text data may be stored line by line, mirroring its presentation in the text region of the image. Specifically, for example, the text data of one recognized text line in the text region is stored as one line of data in the stored text, so as to facilitate subsequent processing.
According to some embodiments, the text to be read and the stored text may be stored in different storage spaces. In addition, according to some embodiments, the read text may also be stored in a different storage space than the text to be read and the stored text.
In step S105, the sum of the character counts of the stored text is calculated as the total number of stored characters.
The character count of a line of text represents the number of characters it contains. With 1 Chinese character counted as 2 characters and 1 English letter, 1 digit, or 1 punctuation mark counted as 1 character, the character count of a line can generally be computed as "number of Chinese characters × 2 + number of letters and digits + number of punctuation marks". For example, the text data of line 1 in Fig. 2 (“onset”, i.e. “turn on vision”.) has 20 characters, specifically: 7 Chinese characters × 2 + 6 punctuation marks = 20 characters. The character count of a text line may also be calculated in other ways and is not limited to the exemplary manner shown here.
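The counting rule can be expressed directly in code. A minimal sketch, assuming the CJK Unified Ideographs range is a sufficient test for "Chinese character" (full-width punctuation falls outside that range and therefore counts as 1):

```python
def char_count(text):
    """Count a line's characters: a Chinese character counts as 2;
    a letter, digit, or punctuation mark counts as 1."""
    total = 0
    for ch in text:
        if '\u4e00' <= ch <= '\u9fff':  # CJK Unified Ideographs
            total += 2
        else:
            total += 1
    return total
```

A line with 7 Chinese characters and 6 punctuation marks thus yields 7 × 2 + 6 = 20 characters, matching the line-1 example.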
As text lines continue to be recognized and text data continues to be stored, the total number of stored characters, i.e., the sum of the character counts of the stored text, changes. The total may therefore be updated with each store operation; this step emphasizes that update.
The total number of stored characters may be calculated from the character counts of the stored text, for use in computing the updated cut-off ratios in the subsequent step S106.
In the case where only one text line has been recognized and stored, the stored text contains only one line of data (the text data of that text line), and the total number of stored characters is the character count of that line. For example, if only the line-1 text in Fig. 2, as the first text line, has been recognized, the total number of stored characters is 20 (with 1 Chinese character counted as 2 characters and 1 punctuation mark as 1 character).
In the case where multiple text lines have been recognized and stored, the stored text contains multiple lines of data, each corresponding to the text data of one of those text lines, and the total number of stored characters is the sum of their character counts. When the line-1 through line-5 text shown in Fig. 2 has been recognized and stored into the stored text, the total number of stored characters is the total character count of lines 1 through 5, that is, 20 + 26 + 30 + 25 + 41 = 142.
According to some embodiments, this step may be combined with step S106 as one step.
In step S106, a cut-off ratio of each line of data in the stored text is calculated and stored, wherein the cut-off ratio is determined by the ratio of the character count of that line of data, plus the character counts of all lines preceding it in the stored text, to the total number of stored characters.
If the position of a recognized text line within the text region is to be determined, this may be achieved, for example, by calculating the cut-off ratio of the text data corresponding to that line (i.e., the corresponding line of data in the stored text described above).
Whenever the text data of a new text line is recognized and stored, the cut-off ratios of the preceding text lines change, so they may be recalculated to keep them up to date; in addition, the cut-off ratio of the new line itself must be calculated. How the cut-off ratio of a text line is calculated is described in more detail below with specific examples.
For example, assuming that text data of 5 text lines has been stored in the stored text, such as from 1 st line text to 5 th line text in the text region shown in fig. 2, the total number of stored characters of the stored text is the total number of characters from 1 st line to 5 th line, that is, 20+26+30+25+41=142.
For example, for the line-3 text, the cut-off ratio is the ratio of the character counts of lines 1 through 3 to the total number of stored characters, i.e., (20 + 26 + 30) / 142 ≈ 54%.
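The cut-off ratio computation can be sketched as follows; this is an illustrative recomputation over the whole stored text, run each time a line is added:

```python
def cutoff_ratios(char_counts):
    """Cut-off ratio of line i: sum of the character counts of lines
    1..i divided by the total number of stored characters."""
    total = sum(char_counts)
    ratios = []
    running = 0
    for count in char_counts:
        running += count
        ratios.append(running / total)
    return ratios
```

For the five stored lines of Fig. 2, `cutoff_ratios([20, 26, 30, 25, 41])` gives roughly 14%, 32%, 54%, 71%, and 100%; the last stored line is always at 100%.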
The above illustrates obtaining the cut-off ratio from character counts in order to explain the meaning of "cut-off ratio". The cut-off ratio of each line of data may also be determined from other parameters, for example, the spatial position of the corresponding text line within the text region: the ratio of the area from the first text line through that text line to the area of the whole text region, the ratio of the number of lines from the first text line through that text line to the total number of lines in the text region, and so on.
According to some embodiments, the cut-off ratio of the most recently recognized and stored text line is always a fixed value, 100%, since at that moment it is the last line of the stored text. This special case, however, applies only to the newest stored line.
The above describes identifying and storing text lines in a manner that facilitates multi-line reading, for example, by splice-storing the text data. By splice storage it is meant here that data obtained discretely (e.g., line by line) is stored together in sequence.
Through steps S104-S106, the related information of the text lines (such as the cut-off ratio) is stored in real time along with the text data, which facilitates management of the text data. Moreover, the cut-off ratios of the stored text lines are dynamically updated during storage, yielding accurate values that greatly ease locating and re-reading text data; reading speed and accuracy can thereby be greatly improved, providing the required services to users.
According to some embodiments, as shown in Fig. 4, at step S110, a location identifier of the text line may be stored to represent the position of the text line within the text region.
The location identifier representing the position of the text line within the text region may be a line number; for example, the first line may be denoted "001". The location identifier may also be represented in other ways and is not limited to the manner illustrated here.
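A line-number identifier such as "001" is simply a zero-padded string. A trivial sketch (the three-digit width is an assumption taken from the example, not a requirement of the disclosure):

```python
def location_id(line_number, width=3):
    """Zero-padded line-number identifier, e.g. line 1 -> '001'."""
    return str(line_number).zfill(width)
```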
Although step S110 shown in fig. 4 is immediately after step S104 of storing text data to stored text, the storing of the location identification may be performed together with the storing of the text data, that is, step S110 may be incorporated into step S104. Alternatively, step S110 of storing the location identifier of the text line may be performed after step S106. In summary, the present disclosure is not limited to the steps and order of execution thereof shown in fig. 4.
Fig. 5 illustrates a character recognition and storage process according to an exemplary embodiment of the present disclosure.
As shown in Fig. 5, for example, in the case where the text line to be recognized is the line-1 text, the following are stored as the text data, location identifier, and cut-off ratio of the line-1 text, respectively: “onset”, i.e. “turn on vision”. [001], [100%]. The numerical values such as 001 and 100% shown in Fig. 5 are exemplary, and the present disclosure is not limited to these forms.
For ease of description, information related to a text line (excluding text data of the text line) will sometimes be referred to herein as "related information of the text line", which may include, for example, a location identification, a cut-off duty ratio, etc., as shown in fig. 5.
According to some embodiments, the text data, location identifier, and cut-off ratio may be stored in a storage space such as a cache, or in other forms of storage space.
In the present disclosure, the text data, location identifier, and cut-off ratio need not be stored in the order and positions illustrated in Fig. 5. Nor must they be stored adjacently; the order and position of their storage may be arranged according to actual demands or the actual conditions of the storage space. This will not be repeated when storage is discussed later.
According to some embodiments, each line of data in the stored text is stored in association with its location identifier and cut-off ratio.
As described above, the related information may include the location identifier, the cut-off ratio, and so on of the text line.
In this disclosure, "storing in association" may mean, for example, storing the text data of the text lines (i.e., the corresponding lines of data in the stored text) and their related information sequentially in the same storage area, which facilitates the unified collection and management of all text data and related information. It may also mean storing the text data of the text lines together, for example in the same storage area, to facilitate reading of the text data, while the related information is stored in a different storage area, to facilitate its collection and management. "Storing in association" may also include other storage arrangements not shown here. In short, the storage scheme for the text data and the related information can be flexibly set as required.
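One way to realize such associated storage is a single record per line that holds the text data and its related information together, with the cut-off ratios refreshed on every store. The record layout and function names below are illustrative assumptions, not the patent's data structures:

```python
from dataclasses import dataclass

@dataclass
class StoredLine:
    """Text data of one line stored together with its related
    information (character count, location identifier, cut-off ratio)."""
    text: str
    char_count: int
    location_id: str
    cutoff_ratio: float

def store_line(stored_text, text, char_count):
    """Append a line to the stored text and refresh every stored
    line's cut-off ratio against the new total character count."""
    loc = str(len(stored_text) + 1).zfill(3)
    stored_text.append(StoredLine(text, char_count, loc, 0.0))
    total = sum(line.char_count for line in stored_text)
    running = 0
    for line in stored_text:
        running += line.char_count
        line.cutoff_ratio = running / total
```

Here the location identifier doubles as an implicit index, so the same record supports both ordered reading and position lookup by cut-off ratio.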
According to some embodiments, when the text data is stored separately from the related information, the desired association may be established between the two stores, for example, by employing the same index number. For instance, the location identifier may serve as the index for both the text data store and the related-information store.
Alternatively, the text data storage area may hold only text data. Although this seems to leave no correspondence between the stored text data and the positions of the text lines, the cut-off ratio kept in the separately stored related information makes it possible to quickly locate the text data of a text line to be read within the text data storage area.
In addition, the character count of each recognized text line may be stored as part of its related information, for example, together with the location identifier and the cut-off ratio. The operation of storing the character count may be performed in step S110, or alternatively in step S104 or step S106; this is not limited here.
Fig. 6 illustrates a character recognition and storage process according to another exemplary embodiment of the present disclosure; compared with Fig. 5, Fig. 6 additionally stores the character count of the text line.
As shown in Fig. 6, for example, in the case where the text line to be recognized is the line-1 text, the following are stored as the text data, character count, location identifier, and cut-off ratio of the line-1 text, respectively: “onset”, i.e. “turn on vision”. [20], [001], [100%]. Fig. 6 thus shows an example in which the character count of the line-1 text is stored together with the location identifier and the cut-off ratio.
As described above, the related information may include the character count in addition to the location identifier and cut-off ratio of the text line.
Storing the character count of each recognized text line facilitates calculating and updating the cut-off ratio.
According to some embodiments, as shown in Fig. 7, the image processing method of the present disclosure may further include: step S120, in which, after the image is acquired in step S101, character recognition may be performed on the first text line to be recognized in the text region, and the obtained text data may be stored separately in a first storage area of the text to be read.
That is, the text data and/or related information of the first text line is not stored together with that of the subsequent text lines, but separately, in its own storage area (e.g., the first storage area of the text to be read).
Here, the "first text line to be recognized" may be the line-1 text of the entire text region, or it may be the first line to be recognized within a subset of the lines of the text region, rather than line 1 of the whole region.
The operation performed on the first text line to be recognized is singled out here to distinguish it from the text lines whose text data is splice-stored. For example, the first text line may be stored separately, in a storage area (e.g., the first storage area) different from that of the other text lines. One purpose of separate storage is fast reading: reading can begin as soon as this text is stored, without waiting for the recognition and storage of subsequent text lines. This greatly reduces the reading wait and effectively improves reading speed; in particular, fast reading of the line-1 text of interest is very helpful for the perceived performance of the reading device and for the user experience.
In addition, the related information of the first text line to be recognized can be stored together with its text data, for convenient access, collection, and management.
After the line-1 text of interest has been recognized and stored separately, the remaining text lines of interest can take part in the splice storage of text data during subsequent recognition and storage, so as to facilitate multi-line reading. In the case where recognition and storage are faster than reading, several lines may already have been recognized and stored while one line is being read; the stored text data therefore suffices for the reading device to recognize, store, and read simultaneously, without spending the waiting time of recognizing and storing all text lines first as in the prior art. This greatly reduces the reading wait and improves reading speed and efficiency.
As shown in fig. 8, the image processing method according to the present disclosure may further include: step S107, after calculating and storing the cut-off duty ratio of the stored text (the first text line is stored separately, so that it is not within the stored text), judging whether there is a next text line to be recognized in the text area, if so, going to step S102, performing the character recognition operation of step S102 for the next text line to be recognized, and continuing the operations of steps S103-S107 in sequence. The operations of steps S102 to S107 are performed in such a loop until all the text lines to be recognized in the text area are recognized and stored. That is, if there is no text line to be recognized next, the recognition and storage of the text region may be ended.
The process of identifying and storing the text lines in a text region will be described in connection with specific examples, and in particular how to update the cut-off duty cycle.
In the example illustrated here, in step S120, the 1st line of text of the text region shown in fig. 2 is recognized and stored separately. The text data of the 1st line may be stored separately in, for example, a first storage area of the text to be read, and may also be stored in the stored text.
In addition, in step S120, related information about the 1st line of text may be stored. As shown in fig. 6, the separately stored text data and related information (including the number of characters, the position identifier and the cut-off ratio) of the 1st line are:
[ NextVPU, that is, "turn on vision". ], [20], [001], [100%].
As described above, the text data and related information of the 1st line are stored separately, i.e., not stored together with the subsequent text lines. In the case of separate storage, the cut-off ratio of the 1st line may be omitted. Note that, as shown in fig. 5, the number of characters of the 1st line may also be omitted.
As described above, the text data of the first text line is stored separately for quick reading.
Then, in steps S102 to S104, the 2nd line of text of the text region may be recognized and stored into the text to be read (S103) and into the stored text (S104), respectively.
As shown in fig. 9 (a), the text data of the 2nd line in the text to be read and in the stored text at this time is:
[ NextVPU Electronics is dedicated to computer vision processors and ].
Since the 1st line of text is stored separately, for example in a first storage area of the text to be read and/or in, for example, a first storage area of the stored text, no other line of text is currently stored together with the 2nd line, whose cut-off ratio is therefore 100%. Thus, for the 2nd line of text, steps S105 to S106 may be omitted, since its cut-off ratio can be determined directly at step S104.
Note that the cut-off ratio of the 2nd line of text, which is the first line of data in the stored text at this point, is 100%, the same as that of the first line of the text region, because the 2nd line is stored in a different storage space from the 1st line for the purpose of quick reading, as described above. However, once the next text line to be recognized has been stored, the cut-off ratio of the 2nd line will be updated to a different value.
Next, in step S107, it is determined whether there is a next line of text to be recognized. If it is determined that there is a next text line to be recognized (for example, the 3rd line of text), the process returns to step S102, the 3rd line is recognized, and its text data is stored in steps S103 and S104, respectively. The position identifier of this line may then be stored in the subsequent step S110, and the number of characters may also be stored in that step.
Since the number of characters of this line is 30 and the position identifier is 003, the text data and related information of the 3rd line stored at this time are:
[ artificial intelligence application products' innovation and ], [30], [003].
In step S105, the total number of stored characters, that is, the number of characters in line 2 plus the number in line 3, is calculated: 26+30=56. This is used to update the cut-off ratio of each stored line preceding the 3rd line of text; at this time, only the 2nd line precedes the 3rd line and needs to be updated.
In step S106, the cut-off ratio of the stored text (excluding the separately stored 1st line) is calculated and stored. At this time, the cut-off ratio of the 2nd line is updated from the previous 100% to the number of characters of the 2nd line divided by the total number of stored characters, i.e., 26/56=46%, and the cut-off ratio of the 3rd line is 100%.
At this time, as shown in fig. 9 (b), with the 1st line of text stored separately, the stored text data and related information of the subsequent text lines are as follows:
[ NextVPU Electronics is dedicated to computer vision processors and ], [26], [002], [46%]
[ artificial intelligence application products' innovation and ], [30], [003], [100%].
Next, in step S107, it is determined whether there is a next line of text to be recognized. If it is determined that there is a next text line to be recognized, for example the 4th line of text, the process returns to step S102, the 4th line is recognized, and its text data is stored in steps S103 and S104, respectively.
Since the number of characters of this line is 25 and the position identifier is 004, the text data and related information of the 4th line after the storing in step S110 are:
[ research and development, providing robots, unmanned aerial vehicles, ], [25], [004].
In step S105, the total number of stored characters, that is, the sum over lines 2, 3 and 4, is calculated: 26+30+25=81. This is used to update the cut-off ratio of the text lines preceding the 4th line; at this time, the 2nd and 3rd lines precede the 4th line (the 1st line is stored separately and is therefore not included), so the cut-off ratios of the 2nd and 3rd lines need to be updated.
In step S106, the cut-off ratio of the 2nd line may be updated to the number of characters of the 2nd line divided by the total number of stored characters, i.e., 26/81=32%; the cut-off ratio of the 3rd line may be updated to (characters of line 2 + characters of line 3) / total stored characters, i.e., (26+30)/81=69%; and the cut-off ratio of the 4th line may be determined to be 100%.
At this time, as shown in fig. 9 (c), the text data and related information of the text lines following the 1st line (lines 2 to 4) are stored as follows:
[ NextVPU Electronics is dedicated to computer vision processors and ], [26], [002], [32%]
[ artificial intelligence application products' innovation and ], [30], [003], [69%]
[ research and development, providing robots, unmanned aerial vehicles, ], [25], [004], [100%].
Next, in step S107, it is determined whether there is a next line of text to be recognized. If it is determined that there is a next text line to be recognized, for example the 5th line of text, the process returns to step S102, the 5th line is recognized, and its text data is stored in steps S103 and S104, respectively.
Since the number of characters of this line is 41 and the position identifier is 005, the text data and related information of the 5th line after the storing in step S110 are:
[ automobiles, security monitoring and other professional fields with end-to-end solutions. ], [41], [005].
In step S105, the total number of stored characters, that is, the sum over lines 2, 3, 4 and 5, is calculated: 26+30+25+41=122. This is used to update the cut-off ratio of each text line preceding the 5th line; at this time, the 2nd, 3rd and 4th lines precede the 5th line, so their cut-off ratios need to be updated.
In step S106, the cut-off ratio of the 2nd line may be updated to the number of characters of the 2nd line divided by the total number of stored characters, i.e., 26/122=21%; the cut-off ratio of the 3rd line may be updated to (characters of line 2 + characters of line 3) / total stored characters, i.e., (26+30)/122=46%; and the cut-off ratio of the 4th line may be updated to (characters of line 2 + characters of line 3 + characters of line 4) / total stored characters, i.e., (26+30+25)/122=66%; the cut-off ratio of the 5th line is determined to be 100%.
At this time, as shown in fig. 9 (d), the stored text data and related information, other than those of the 1st line, are as follows:
[ NextVPU Electronics is dedicated to computer vision processors and ], [26], [002], [21%]
[ artificial intelligence application products' innovation and ], [30], [003], [46%]
[ research and development, providing robots, unmanned aerial vehicles, ], [25], [004], [66%]
[ automobiles, security monitoring and other professional fields with end-to-end solutions. ], [41], [005], [100%].
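The splice storage and cut-off ratio updates walked through above can be sketched as follows. This is an illustrative sketch only: the function names, the list-of-pairs representation, and the rounding of percentages are assumptions, not part of the disclosed method.

```python
def update_cutoff_ratios(stored):
    """Recompute each stored line's cut-off ratio: the cumulative number of
    characters up to and including that line, divided by the total number
    of stored characters (steps S105-S106)."""
    total = sum(len(text) for text, _ in stored)
    cumulative = 0
    for i, (text, _) in enumerate(stored):
        cumulative += len(text)
        stored[i] = (text, round(100 * cumulative / total))
    return stored

def process_text_region(recognized_lines):
    """Store line 1 separately for quick reading; splice lines 2..n into
    the stored text, refreshing cut-off ratios after each new line
    (the step S102-S107 loop)."""
    first_line = recognized_lines[0]    # stored alone (step S120)
    stored = []                         # (text, cutoff-percent) pairs
    for text in recognized_lines[1:]:
        stored.append((text, 100))      # the newest line is always 100%
        update_cutoff_ratios(stored)    # update the preceding lines
    return first_line, stored
```

With lines of 20, 26, 30, 25 and 41 characters, as in the example above, the final ratios of lines 2 to 5 come out to 21%, 46%, 66% and 100%, matching fig. 9 (d).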
According to some embodiments, as described above, the number of characters in the stored related information shown in fig. 9 (a)-(d) is not necessary; that is, the number of characters may be omitted. According to some embodiments, the number of characters in the stored related information shown in fig. 9 (a)-(d) may also be replaced by another parameter, such as the area of the text region occupied by the corresponding text line, in which case the cut-off ratio in the related information is likewise derived from that area parameter.
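The area-based alternative can be sketched in the same way: the cut-off ratio is a cumulative fraction, whether the per-line quantity is a character count or an occupied area. The function name and the sample area values below are illustrative assumptions.

```python
def cutoff_ratios_from_areas(line_areas):
    """Cut-off ratio per stored line, computed from the area each line
    occupies in the text region instead of from its character count."""
    total = sum(line_areas)
    ratios, cumulative = [], 0
    for area in line_areas:
        cumulative += area
        ratios.append(round(100 * cumulative / total))
    return ratios
```

If the areas happen to be proportional to the character counts of the example (26, 30, 25, 41), the same percentages result.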
Regarding the calculation and updating of the cut-off ratio, it is alternatively possible to wait until all the text lines to be recognized in the entire text region have been recognized and then calculate the cut-off ratio of each text line, so that the cut-off ratio only needs to be calculated once and need not be calculated and updated each time a text line is recognized.
As described above, the present disclosure does not limit the manner of storage; for example, the text data and related information of one text region may be stored together, or they may be stored separately.
The above describes the example case in which an image contains one text region. When an image contains a plurality of text regions, the recognition and storage operations described above may be performed for each text region in turn, until all the text lines (or those of interest) in each text region have been recognized and stored.
According to some embodiments, when a plurality of text regions are included in one image, the text data of the plurality of text regions may be stored together or separately; neither choice affects the essence of the present disclosure.
According to the present disclosure, the stored text data and related information are available for reading. The text data and related information of each text line obtained through the above steps are very advantageous for various reading operations. For example, the recognition and storage of text data can proceed in parallel with the reading of text data: rather than waiting until all text lines have been recognized and stored, as in the prior art, reading can begin while recognition and storage continue, without interfering with them, which greatly increases the reading speed. Reading is described in detail below.
According to some embodiments, the text data in the currently stored text to be read is read while said recognizing and said storing are performed sequentially for the text region.
For example, the text data in the currently stored text to be read may be read while text data is sequentially recognized and stored. For example, if the text in a text region is arranged in rows, each row of text data may be detected, recognized and stored in turn, while the text data currently stored in the text to be read is read.
According to some embodiments, whenever the reading device fetches and reads the text to be read, that text becomes read text. Thus, the text data in the currently stored text to be read is the newly recognized and newly stored, as yet unread, text data accumulated since the reading device last fetched the text to be read.
For example, assume that the currently stored text to be read is the text data of lines 2 to 3 of a text region. The reading device fetches and reads it, whereupon it becomes read text. The newly recognized text data of lines 4 to 6 is then stored as the current text to be read, and the text to be read that the reading device fetches and reads next is the text data of lines 4 to 6.
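The to-be-read / read bookkeeping just described can be modeled as below. This is a minimal sketch; the class and method names are assumptions for illustration, not part of the disclosure.

```python
class ReadBuffer:
    """Tracks the currently stored text to be read and the read text."""

    def __init__(self):
        self.to_read = []   # newly recognized lines, not yet fetched
        self.read = []      # lines already fetched by the reading device

    def store_line(self, text):
        """Recognition side: append a newly stored line (steps S103/S104)."""
        self.to_read.append(text)

    def fetch_for_reading(self):
        """Reading side: fetch everything currently pending in one pass;
        the fetched lines thereby become read text."""
        pending, self.to_read = self.to_read, []
        self.read.extend(pending)
        return " ".join(pending)
```

In the example above, a fetch while lines 2-3 are pending returns both lines together; lines 4-6, recognized and stored afterwards, form the next fetch.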
By splicing and storing the text data of the text lines as described above, coherent reading is achieved: the unnecessary pauses between lines found in the prior art are reduced, and reading no longer has to wait until all the text lines have been recognized and stored, as in the prior art, so the reading efficiency is improved.
An example of sequential reading is described below with reference to fig. 10. For the text region shown in fig. 2, reading may begin immediately after the first text line has been recognized and stored (e.g., in step S120) (the "first reading" shown in fig. 10), shortening the reading waiting time and increasing the reading speed. While the first text line is being read, the subsequent text lines continue to be recognized and stored, achieving the beneficial technical effect of reading while recognizing and storing.
After the text data of the first text line has been read, the currently stored text to be read is read next (the "second reading" shown in fig. 10). Suppose that, while the text data of the first text line was being read, the 2nd line and then the 3rd line were recognized and stored; the currently stored text to be read is then the 2nd and 3rd lines: [ NextVPU Electronics is dedicated to computer vision processors and ], [ artificial intelligence application products' innovation and ] (the stored text data of line 1 has become read text). Thus, the text data of lines 2 and 3 in the text to be read is read next; that is, the content of the second reading is [ NextVPU Electronics is dedicated to computer vision processors and artificial intelligence application products' innovation and ]. The speech spanning the 2nd and 3rd lines of the second reading is thereby coherent, with semantic links and context, overcoming the stiff, mechanical pauses or stutters that occur in the prior art when reading word by word or line by line.
In addition, during the second reading, the subsequent text lines are still being recognized and stored, again yielding the beneficial technical effect of reading while recognizing and storing. After the second reading, reading may continue with, for example, a third reading, and so on, in order to read the stored text data that has not yet been read (such as lines 4 and 5). This cycle repeats until all the text lines in the text region have been read, completing the sequential reading of the text region.
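Reading while recognizing and storing can be sketched as a producer-consumer pipeline. The sketch below uses Python threads and a shared queue purely for illustration; it is not the disclosed hardware implementation, and the function names are assumptions.

```python
import queue
import threading

def recognize_and_store(lines, buffer):
    """Producer: push each recognized line into the shared buffer as soon
    as it is stored; None marks the end of the text region."""
    for text in lines:
        buffer.put(text)
    buffer.put(None)

def read_aloud(buffer, spoken):
    """Consumer: read lines as they become available, instead of waiting
    for the whole region to be recognized and stored first."""
    while True:
        text = buffer.get()
        if text is None:
            break
        spoken.append(text)   # stand-in for text-to-speech output

buffer = queue.Queue()
spoken = []
producer = threading.Thread(target=recognize_and_store,
                            args=(["line 1", "line 2", "line 3"], buffer))
consumer = threading.Thread(target=read_aloud, args=(buffer, spoken))
producer.start(); consumer.start()
producer.join(); consumer.join()
```

Because the consumer drains the queue in arrival order, the read-out order always matches the recognition order, even though the two sides run concurrently.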
By adopting the splice-storage scheme described above, reading proceeds while recognition and storage continue, so that reading is more coherent and fluent. The methods and related devices of the present disclosure can help, for example, visually impaired users, elderly users, or users with reading difficulties to receive and understand, more easily, the information automatically read out from text regions by, for example, a vision-impaired assisting device.
According to some embodiments, the text data of a stored text line may be modified, and the number of characters and the cut-off ratio of that text line updated accordingly. For example, characters in the stored text data may be replaced (e.g., a misrecognized character replaced with the correct one), deleted (e.g., a superfluous character removed) or added (e.g., a missed character inserted), so that the stored text data becomes more accurate.
After such a modification, the number of characters of the text data may change, so the number of characters of the modified text data, and the corresponding cut-off ratios, are updated accordingly, keeping the stored related information accurate.
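A modification of a stored line, together with the corresponding refresh of character counts and cut-off ratios, might look like the sketch below (the function name and the list-of-pairs representation are assumptions; percentages are rounded):

```python
def modify_stored_line(stored, index, new_text):
    """stored: list of (text, cutoff_percent) pairs. Replace one line's
    text (e.g. to fix a misrecognized character), then recompute every
    line's cut-off ratio from the new character counts."""
    texts = [text for text, _ in stored]
    texts[index] = new_text
    total = sum(len(text) for text in texts)
    updated, cumulative = [], 0
    for text in texts:
        cumulative += len(text)
        updated.append((text, round(100 * cumulative / total)))
    return updated
```

For instance, deleting four superfluous characters from a 30-character second line stored after a 26-character first line changes the first line's ratio from 46% to 50%.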
According to some embodiments, for a particular type of text line, a type identification representing that type of text line is stored and, based on that identification, a prompt is issued to the user at reading time.
For such a particular type of text line, a type identification representing the type of the text line may be stored. If, at reading time, a text line to be read is determined to correspond to such a type identification, a corresponding prompt can be issued to the user. For example, if a text line to be read is determined to be a title line, the user may be given a prompt such as "this is a title line". If a text line to be read is determined to be an unclear line, the user may be given a prompt such as "this line of text cannot be recognized, please understand".
According to some embodiments, the cues may comprise one of an audible cue, a vibration cue, a text cue, an image cue, a video cue, or a combination thereof.
According to some embodiments, the particular types of text line comprise: a first type of text line, wherein the first type is determined by text size; and a second type of text line, wherein the second type is determined by text line sharpness. For example, the first type of text line may be a title line, header, footer, etc., whose text size tends to differ from that of the other text lines. The second type of text line refers to text lines that cannot be clearly recognized, i.e., text lines of low sharpness (e.g., below a preset sharpness threshold).
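One possible way to tag the two special types and map the stored type identification to a prompt at reading time is sketched below. The size factor, the sharpness threshold and the prompt wording are all assumptions for illustration; the disclosure only requires that some size and sharpness criteria exist.

```python
PROMPTS = {
    "title": "this is a title line",
    "unclear": "this line of text cannot be recognized, please understand",
}

def classify_line(line_height, median_height, sharpness, *,
                  size_factor=1.5, sharpness_threshold=0.5):
    """Tag a line as 'title' (first type, by text size), 'unclear'
    (second type, by sharpness), or 'body' otherwise."""
    if sharpness < sharpness_threshold:
        return "unclear"
    if line_height > size_factor * median_height:
        return "title"
    return "body"

def prompt_for(line_type):
    """Prompt issued to the user when a special-type line is read."""
    return PROMPTS.get(line_type)
```

An ordinary body line yields no prompt, so reading proceeds without interruption.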
According to some embodiments, the text lines may be arranged in a lateral, vertical, or diagonal direction.
According to another aspect of the present disclosure, there is also provided an electronic circuit, which may include: circuitry configured to perform the steps of the method described above.
According to another aspect of the present disclosure, there is also provided a vision impairment assisting apparatus, including: a camera configured to acquire an image; the electronic circuit described above; circuitry configured to perform text detection and recognition of text contained in the image to obtain text data; circuitry configured to read the text data.
According to another aspect of the present disclosure, there is also provided an electronic apparatus including: a processor; and a memory storing a program comprising instructions that when executed by the processor cause the processor to perform the method described above.
According to another aspect of the present disclosure, there is also provided a non-transitory computer readable storage medium storing a program comprising instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the above-described method.
Fig. 11 is a block diagram illustrating an example of an electronic device according to an exemplary embodiment of the present disclosure. Note that the structure shown in fig. 11 is only one example; depending on the particular implementation, the electronic device of the present disclosure may include only one or more of the components shown in fig. 11.
The electronic device 2000 may be, for example, a general-purpose computer (e.g., various computers such as a laptop computer or a tablet computer), a mobile phone, a personal digital assistant, etc. According to some embodiments, the electronic device 2000 may be a vision-impaired assisting device or a reading assisting device.
The electronic device 2000 may be configured to capture images, process the captured images, and provide prompts in response to the processing. For example, the electronic device 2000 may be configured to capture an image, perform text detection and recognition on the image to obtain text data, convert the text data to sound data, and may output the sound data for listening by a user and/or output the text data for viewing by the user.
According to some embodiments, the electronic device 2000 may comprise a spectacle frame, or be configured to be removably mountable to a spectacle frame (e.g., a rim of the frame, the connection connecting the two rims, a temple, or any other portion), so as to be able to capture an image that approximately covers the field of view of the user.
According to some embodiments, the electronic device 2000 may also be mounted to or integrated with other wearable devices. The wearable device may be, for example: head-mounted devices (e.g., helmets, hats, etc.), devices that can be worn on the ears, etc. According to some embodiments, the electronic device may be implemented as an accessory attachable to a wearable device, for example as an accessory attachable to a helmet or hat, etc.
According to some embodiments, the electronic device 2000 may also have other forms. For example, the electronic device 2000 may be a mobile phone, a general-purpose computing device (e.g., a laptop computer, a tablet computer, etc.), a personal digital assistant, and so on. The electronic device 2000 may also have a base so that it can be placed on a desktop.
According to some embodiments, the electronic device 2000 may be used as a vision-impaired assisting device to aid reading, in which case it is sometimes also referred to as an "electronic reader" or "reading assisting device". With the electronic device 2000, a user who cannot read autonomously (e.g., a visually impaired person or a person with a reading impairment) can adopt a posture similar to an ordinary reading posture to "read" conventional reading materials (e.g., books or magazines). During reading, the electronic device 2000 may acquire an image, perform character recognition on the text lines in the image to obtain text data, store the obtained text data in the text to be read for reading, store the text data in the stored text, and store related information of each line such as its position identifier (e.g., line number), number of characters and cut-off ratio, so as to facilitate quick reading of the text data and give the read-out text data semantic links and context, avoiding the stiff pauses caused by line-by-line or word-by-word reading.
The electronic device 2000 may include a camera 2004 for capturing and acquiring images. The camera 2004 may capture still images or moving images, may include, but is not limited to, a video camera, etc., and is configured to acquire an initial image including an object to be recognized. The electronic device 2000 may also include an electronic circuit 2100, which includes circuitry configured to perform the steps of the methods described above. The electronic device 2000 may also include a text recognition circuit 2005 configured to perform text detection and recognition (e.g., OCR processing) on the text in the image to obtain text data; the text recognition circuit 2005 may be implemented by, for example, a dedicated chip. The electronic device 2000 may also include a sound conversion circuit 2006 configured to convert the text data into sound data; the sound conversion circuit 2006 may likewise be implemented by, for example, a dedicated chip. The electronic device 2000 may further include a sound output circuit 2007 configured to output the sound data; the sound output circuit 2007 may include, but is not limited to, headphones, a speaker or a vibrator, etc., together with their corresponding driving circuits. The aforementioned recognition device may include, for example, the text recognition circuit 2005, and the aforementioned reading device may include, for example, the sound conversion circuit 2006 and the sound output circuit 2007.
According to some embodiments, the electronic device 2000 may also include an image processing circuit 2008, which may include circuitry configured to perform various kinds of image processing on images. The image processing circuit 2008 may include, for example, but is not limited to, one or more of the following: a circuit configured to reduce noise in an image, a circuit configured to deblur an image, a circuit configured to geometrically correct an image, a circuit configured to extract features from an image, a circuit configured to perform target detection and recognition on a target object in an image, a circuit configured to detect text contained in an image, a circuit configured to extract text lines from an image, a circuit configured to extract text coordinates from an image, and the like.
According to some embodiments, the electronic circuit 2100 may further include a text processing circuit 2009, which may be configured to perform various kinds of processing based on the extracted text-related information (e.g., text data, text boxes, paragraph coordinates, text line coordinates, text coordinates, etc.), thereby obtaining processing results such as paragraph ordering, text semantic analysis and layout analysis results.
For example, one or more of the various circuits described above may be implemented by programming hardware (e.g., programmable logic circuits including field programmable gate arrays (FPGAs) and/or programmable logic arrays (PLAs)) in an assembly language or a hardware programming language (such as VERILOG, VHDL or C++) using logic and algorithms according to the present disclosure, may be implemented with custom hardware, and/or may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof.
According to some embodiments, the electronic device 2000 may also include a communication circuit 2010, which may be any type of device or system enabling communication with an external device and/or a network, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication device and/or a chipset, such as a Bluetooth device, an 802.11 device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.
According to some embodiments, the electronic device 2000 may also include an input device 2011, which input device 2011 may be any type of device capable of inputting information to the electronic device 2000 and may include, but is not limited to, various sensors, mice, keyboards, touch screens, buttons, levers, microphones, and/or remote controls, and the like.
According to some implementations, the electronic device 2000 may also include an output device 2012, which may be any type of device capable of presenting information and may include, but is not limited to, a display, a visual output terminal, a vibrator and/or a printer, etc. Although the electronic device 2000 mainly uses sound-based output according to some embodiments, a vision-based output device may facilitate the user's family members, maintenance personnel, etc. in obtaining output information from the electronic device 2000.
According to some embodiments, the electronic device 2000 may also include a processor 2001. The processor 2001 may be any type of processor and may include, but is not limited to, one or more general-purpose processors and/or one or more special-purpose processors (e.g., special processing chips); it may be, for example, but is not limited to, a central processing unit (CPU) or a microprocessor (MPU). The electronic device 2000 may also include a working memory 2002, which may store programs (including instructions) and/or data (e.g., images, text, sound and other intermediate data) useful for the operation of the processor 2001, and may include, but is not limited to, random access memory and/or read-only memory devices. The electronic device 2000 may also include a storage device 2003, which may be any non-transitory storage device capable of storing data, and may include, but is not limited to, a magnetic disk drive, an optical storage device, solid-state memory, a floppy disk, a flexible disk, a hard disk, magnetic tape or any other magnetic medium, an optical disk or any other optical medium, a ROM (read-only memory), a RAM (random access memory), a cache memory and/or any other memory chip or cartridge, and/or any other medium from which a computer can read data, instructions and/or code. The working memory 2002 and the storage device 2003 may be collectively referred to as "memory", and may in some cases be used in combination with each other.
According to some embodiments, the processor 2001 may control and schedule at least one of the camera 2004, the text recognition circuit 2005, the sound conversion circuit 2006, the sound output circuit 2007, the image processing circuit 2008, the text processing circuit 2009, the communication circuit 2010, the electronic circuit 2100 and the other various devices and circuits included in the electronic device 2000. According to some embodiments, at least some of the components described in fig. 11 may be interconnected and/or communicate with one another via lines 2013.
Software elements (programs) may reside in the working memory 2002 and may include, but are not limited to, an operating system 2002a, one or more application programs 2002b, drivers, and/or other data and code.
According to some embodiments, instructions for performing the foregoing control and scheduling may be included in the operating system 2002a or one or more application programs 2002 b.
According to some embodiments, instructions to perform the method steps described in the present disclosure may be included in one or more applications 2002b, and the various modules of the electronic device 2000 described above may be implemented by the instructions of one or more applications 2002b being read and executed by the processor 2001. In other words, the electronic device 2000 may include a processor 2001 and a memory (e.g., working memory 2002 and/or storage device 2003) storing a program comprising instructions that, when executed by the processor 2001, cause the processor 2001 to perform methods as described in various embodiments of the present disclosure.
According to some embodiments, some or all of the operations performed by at least one of text recognition circuit 2005, sound conversion circuit 2006, image processing circuit 2008, text processing circuit 2009, electronic circuit 2100 may be implemented by processor 2001 reading and executing instructions of one or more application programs 2002.
Executable code or source code of instructions of software elements (programs) may be stored in a non-transitory computer readable storage medium (such as the storage device 2003) and may, when executed, be loaded into the working memory 2002 (and possibly compiled and/or installed). Accordingly, the present disclosure provides a computer readable storage medium storing a program comprising instructions that, when executed by a processor of an electronic device (e.g., a vision-impaired assisting device), cause the electronic device to perform a method as described in various embodiments of the present disclosure. According to another embodiment, executable code or source code of instructions of the software elements (programs) may also be downloaded from a remote location.
It should also be understood that various modifications may be made according to specific requirements. For example, custom hardware may be used, and/or particular circuits, units, modules, or elements may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, some or all of the circuits, units, modules, or elements contained in the disclosed methods and apparatus may be implemented, using logic and algorithms according to the present disclosure, by programming hardware (e.g., programmable logic circuits including field programmable gate arrays (FPGAs) and/or programmable logic arrays (PLAs)) in an assembly language or a hardware programming language such as Verilog, VHDL, or C++.
According to some embodiments, the processor 2001 in the electronic device 2000 may be distributed over a network. For example, some processes may be performed using one processor while other processes may be performed by another processor remote from that one processor. Other modules of the electronic device 2000 may also be similarly distributed. As such, the electronic device 2000 may be interpreted as a distributed computing system that performs processing in multiple locations.
Some exemplary aspects of the disclosure will be described below.
Aspect 1. An image processing method, the method comprising:
acquiring an image, wherein the image comprises a text region;
in the text region, carrying out character recognition on a text line to be recognized so as to obtain text data of the text line; and
storing the text data of the text line to a text to be read.
Aspect 2. The image processing method according to aspect 1, further comprising:
storing text data of the text line to a stored text as a line of data in the stored text;
calculating the sum of the character numbers of the stored text as the total stored character number; and
calculating and storing a cut-off duty ratio for each line of data in the stored text, wherein the cut-off duty ratio is determined by the ratio of the sum of the number of characters before the line of data in the stored text and the number of characters of the line of data to the total number of stored characters.
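As an illustration of the cut-off duty ratio computation described in aspect 2, the following sketch (function and variable names are illustrative and not part of the disclosure) stores recognized lines and computes, for each line, the ratio of the character count up to and including that line to the total number of stored characters:

```python
def cutoff_ratios(stored_lines):
    """Compute the cut-off duty ratio for each line of stored text.

    For a given line, the ratio is (characters before the line +
    characters of the line) divided by the total stored character number.
    """
    total_chars = sum(len(line) for line in stored_lines)
    ratios = []
    chars_so_far = 0
    for line in stored_lines:
        chars_so_far += len(line)
        ratios.append(chars_so_far / total_chars)
    return ratios

# Three recognized text lines; the last ratio is always 1.0.
print(cutoff_ratios(["hello world", "second line", "end"]))
```

Such a ratio would let a reading position (e.g., for resuming audio playback) be expressed as a fraction of the whole stored text rather than as a raw character offset.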
Aspect 3 the image processing method according to aspect 2, further comprising:
storing a location identification of the text line for representing the location of the text line in the text region, wherein the location identification comprises a line number.
Aspect 4. The image processing method according to aspect 3, wherein each line of data in the stored text is associated with the stored location identification and the cut-off duty ratio.
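The per-line association described in aspect 4 can be sketched as a simple record type; the class and field names below are illustrative and not part of the disclosure:

```python
from dataclasses import dataclass

@dataclass
class StoredLine:
    text: str            # the recognized text data of the line
    line_number: int     # location identification within the text region
    cutoff_ratio: float  # cut-off duty ratio associated with this line

# A stored text of two lines, each carrying its identification and ratio.
stored_text = [
    StoredLine("hello", line_number=1, cutoff_ratio=0.5),
    StoredLine("world", line_number=2, cutoff_ratio=1.0),
]
print(stored_text[0].line_number)  # 1
```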
Aspect 5. The image processing method according to aspect 1, further comprising:
after the image is acquired, performing character recognition on the first text line to be recognized in the text region, and separately storing the obtained text data in a first storage area of the text to be read.
Aspect 6 the image processing method according to aspect 2, further comprising:
after calculating and storing the cut-off duty ratio of each line of data in the stored text, judging whether a next text line to be recognized exists in the text region; and
if there is a next text line to be recognized, character recognition is performed on the next text line to be recognized.
Aspect 7. The image processing method according to aspect 1, further comprising:
reading the text data in the currently stored text to be read while the identification and the storage are performed sequentially on the text region.
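The concurrent behavior of aspect 7, reading already-stored text while later lines are still being recognized, resembles a producer/consumer pipeline. A minimal sketch follows, with `str.upper()` standing in for real character recognition and a list standing in for speech output; all names are illustrative, not part of the disclosure:

```python
import queue
import threading

def recognize_lines(image_lines, out_q):
    """Producer: simulate per-line character recognition and store results."""
    for line in image_lines:
        text = line.upper()  # stand-in for real OCR of one text line
        out_q.put(text)      # append to the "text to be read"
    out_q.put(None)          # sentinel: no more lines to recognize

def read_aloud(out_q, spoken):
    """Consumer: read stored text while recognition is still in progress."""
    while True:
        text = out_q.get()
        if text is None:
            break
        spoken.append(text)  # stand-in for text-to-speech output

q = queue.Queue()
spoken = []
producer = threading.Thread(target=recognize_lines, args=(["line one", "line two"], q))
consumer = threading.Thread(target=read_aloud, args=(q, spoken))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(spoken)  # ['LINE ONE', 'LINE TWO']
```

Because the queue is thread-safe and first-in first-out, the reading side consumes lines in recognition order without waiting for the whole region to finish.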
Aspect 8. The image processing method according to aspect 2, wherein the stored text is modified and the corresponding cut-off duty ratio is updated based on the modification.
Aspect 9. The image processing method according to aspect 8, wherein the modification includes addition, deletion, or substitution.
Aspect 10. The image processing method according to aspect 1, wherein, for a text line of a specific type, a specific-type identification representing the type of the text line is stored, and based on the specific-type identification, a prompt is issued to the user during reading.
Aspect 11. The image processing method according to aspect 10, wherein the specific type text line includes:
a first type of text line, wherein the first type of text line is determined by a text size; and
a second type of text line, wherein the second type of text line is determined by text line sharpness.
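The two specific line types of aspects 10 and 11 (one determined by text size, one by sharpness) could be tagged by a rule such as the following sketch; the threshold values are invented for illustration and are not specified by the disclosure:

```python
def classify_line(height_px, sharpness, min_height=12.0, min_sharpness=0.3):
    """Return hypothetical specific-type identifications for a text line."""
    tags = []
    if height_px < min_height:      # first type: determined by text size
        tags.append("SMALL_TEXT")
    if sharpness < min_sharpness:   # second type: determined by sharpness
        tags.append("BLURRY")
    return tags

# A small but sharp line: the reader could prompt the user that the
# text may be unreliable before reading it aloud.
print(classify_line(height_px=10, sharpness=0.8))  # ['SMALL_TEXT']
```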
Aspect 12. The image processing method according to any one of aspects 1 to 11, wherein the text lines are arranged in a horizontal, vertical, or oblique direction.
Aspect 13. An electronic circuit, comprising:
circuitry configured to perform the steps of the method according to any one of aspects 1 to 12.
Aspect 14. A vision impairment assisting apparatus comprising:
a camera configured to acquire an image;
the electronic circuit of aspect 13;
circuitry configured to perform text detection and recognition of text contained in the image to obtain text data; and
circuitry configured to read the text data.
Aspect 15. An electronic device, comprising:
a processor; and
a memory storing a program comprising instructions that when executed by the processor cause the processor to perform the method according to any one of aspects 1 to 12.
Aspect 16. A non-transitory computer readable storage medium storing a program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method of any of aspects 1-12.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and devices are merely illustrative embodiments or examples, and that the scope of the present disclosure is not limited by these embodiments or examples but is defined only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalents. Furthermore, the steps may be performed in a different order than described in the present disclosure. Further, the various elements of the embodiments or examples may be combined in various ways. It should be noted that, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (15)

1. An image processing method, the method comprising:
acquiring an image, wherein the image comprises a text region;
in the text region, carrying out character recognition on a text line to be recognized so as to obtain text data of the text line; and
storing the text data of the text line to a text to be read;
storing text data of the text line to a stored text as a line of data in the stored text;
calculating the sum of the character numbers of the stored text as the total stored character number; and
calculating and storing a cut-off duty ratio for each line of data in the stored text, wherein the cut-off duty ratio is determined by the ratio of the sum of the number of characters before the line of data in the stored text and the number of characters of the line of data to the total number of stored characters.
2. The image processing method according to claim 1, further comprising:
storing a location identification of the text line for representing the location of the text line in the text region, wherein the location identification comprises a line number.
3. The image processing method of claim 2, wherein each line of data in the stored text is associated with the stored location identification and the cut-off duty ratio.
4. The image processing method according to claim 1, further comprising:
after the image is acquired, performing character recognition on the first text line to be recognized in the text region, and separately storing the obtained text data in a first storage area of the text to be read.
5. The image processing method according to claim 1, further comprising:
after calculating and storing the cut-off duty ratio of each line of data in the stored text, judging whether a next text line to be recognized exists in the text region; and
if there is a next text line to be recognized, character recognition is performed on the next text line to be recognized.
6. The image processing method according to claim 1, further comprising:
reading the text data in the currently stored text to be read while the identification and the storage are performed sequentially on the text region.
7. The image processing method of claim 1, wherein the stored text is modified and the corresponding cut-off duty ratio is updated based on the modification.
8. The image processing method of claim 7, wherein the modification comprises addition, deletion, or substitution.
9. The image processing method according to claim 1, wherein, for a text line of a specific type, a specific-type identification representing the type of the text line is stored, and based on the specific-type identification, a prompt is issued to the user during reading.
10. The image processing method of claim 9, wherein the particular type of text line comprises:
a first type of text line, wherein the first type of text line is determined by a text size; and
a second type of text line, wherein the second type of text line is determined by text line sharpness.
11. The image processing method according to any one of claims 1 to 10, wherein the text lines are arranged in a horizontal, vertical, or oblique direction.
12. An electronic circuit, comprising:
circuitry configured to perform the steps of the method according to any one of claims 1 to 11.
13. A vision impairment aiding device comprising:
a camera configured to acquire an image;
the electronic circuit of claim 12;
circuitry configured to perform text detection and recognition of text contained in the image to obtain text data; and
circuitry configured to read the text data.
14. An electronic device, comprising:
a processor; and
a memory storing a program comprising instructions that when executed by the processor cause the processor to perform the method of any one of claims 1 to 11.
15. A non-transitory computer readable storage medium storing a program, the program comprising instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method of any one of claims 1-11.
CN201911214755.0A 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium Active CN110969161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911214755.0A CN110969161B (en) 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium


Publications (2)

Publication Number Publication Date
CN110969161A (en) 2020-04-07
CN110969161B (en) 2023-11-07

Family

ID=70032598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911214755.0A Active CN110969161B (en) 2019-12-02 2019-12-02 Image processing method, circuit, vision-impaired assisting device, electronic device, and medium

Country Status (1)

Country Link
CN (1) CN110969161B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113920286B * 2020-06-22 2025-05-02 Beijing ByteDance Network Technology Co., Ltd. Character positioning method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596168A * 2018-04-20 2018-09-28 Beijing JD Finance Technology Holding Co., Ltd. Method, apparatus and medium for recognizing characters in an image
CN109389115A * 2017-08-11 2019-02-26 Tencent Technology (Shanghai) Co., Ltd. Text recognition method, device, storage medium and computer equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10558701B2 (en) * 2017-02-08 2020-02-11 International Business Machines Corporation Method and system to recommend images in a social application


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Zaiyin; Tong Lijing; Zhan Jian; Shen Chong. Distorted document image correction based on text region segmentation and text line detection. Computer and Information Technology, 2015, (01), full text. *


Similar Documents

Publication Publication Date Title
EP3940589A1 (en) Layout analysis method, electronic device and computer program product
US10127471B2 (en) Method, device, and computer-readable storage medium for area extraction
CN110991455B (en) Image text broadcasting method and equipment, electronic circuit and storage medium thereof
US10592726B2 (en) Manufacturing part identification using computer vision and machine learning
US10621428B1 (en) Layout analysis on image
EP2849031A1 (en) Information processing apparatus and information processing method
CN111126394A (en) Character recognition method, reading aid, circuit and medium
CN111353501A (en) Book point-reading method and system based on deep learning
CN111242273B (en) Neural network model training method and electronic equipment
KR20190063277A (en) The Electronic Device Recognizing the Text in the Image
JP2006107048A (en) Gaze correspondence control apparatus and gaze correspondence control method
US11367296B2 (en) Layout analysis
CN112001872A (en) Information display method, device and storage medium
CN113780201A (en) Hand image processing method and device, equipment and medium
US10796187B1 (en) Detection of texts
CN111754414A (en) Image processing method and device for image processing
JP2021129299A5 (en)
CN110969161B (en) Image processing method, circuit, vision-impaired assisting device, electronic device, and medium
EP3467820A1 (en) Information processing device and information processing method
CN114281236B (en) Text processing method, apparatus, device, medium, and program product
CN111611986A (en) Focus text extraction and identification method and system based on finger interaction
US11776286B2 (en) Image text broadcasting
WO2020244076A1 (en) Face recognition method and apparatus, and electronic device and storage medium
KR20140134844A (en) Method and device for photographing based on objects
CN117289804B (en) Virtual digital human facial expression management method, device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant