JP6279025B2

JP6279025B2 - Image processing apparatus, image processing apparatus control method, and program

Info

Publication number: JP6279025B2
Application number: JP2016148521A
Authority: JP
Inventors: 剛横溝
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-07-28
Filing date: 2016-07-28
Publication date: 2018-02-14
Anticipated expiration: 2032-04-12
Also published as: JP2016226010A

Description

The present invention relates to an image processing apparatus, a control method for the image processing apparatus, and a program.

In recent years, document creation has come to involve advanced functions that go beyond simply typing characters, such as decorating fonts, drawing figures freely, and importing photographs.

The more sophisticated the content, the more effort is required to create a document entirely from scratch. It is therefore desirable to reuse parts of previously created documents as much as possible, either as they are or after editing.

In addition, with the spread of networks such as the Internet, documents are more often distributed electronically, but electronic documents are still frequently distributed as paper printouts.

Techniques have been devised for obtaining the contents of a paper document as reusable data even when only the paper document is at hand.

For example, a technique has been disclosed in which, when a paper document is read electronically, a document matching its content is retrieved from a database and used in place of the scanned data (see, for example, Patent Document 1).

If no identical document can be retrieved from the database, the content of the paper document is converted into electronic data that is easy to reuse, so the content can be reused in this case as well.

For data obtained from such paper documents, OCR is a technique for converting character information in a document image into easily reusable electronic data. Vectorization is a technique for converting graphic information composed of lines and planes into easily reusable data.

Using these techniques, Patent Document 1 discloses conversion into reusable data by turning characters in a document image into character codes and the outlines of figures into vector data.

Patent Document 1 further discloses a technique for identifying regions such as characters, line drawings, natural images, and tables in a document image, and constructing data that represents the relationships among the regions as a tree structure.

In this technique, the character codes, vector data, image data, and the like are arranged according to that structure, which converts the document image into an electronic document page that can be edited in an application.

The resulting data has a layout equivalent to that of the original document. As with an electronic document page newly created in a document creation application, the position and size of characters and figures can be changed easily, and geometric deformation, coloring, and similar edits can be applied easily.

There is also a technique for recognizing the structure of a tabular region in a document image. For example, a technique for obtaining the row and column structure formed by the rectangular frame regions in a table has been disclosed (see, for example, Patent Document 2).

By combining the row structure of the frame regions obtained with this technique with the OCR results for the characters in each frame obtained with the technique described above, a table region in a document image can be converted into electronic data having a table structure.

Patent Document 1: JP 2004-265384 A
Patent Document 2: JP-A-1-129358

With the conventional techniques described above, an image can be separated into a foreground image, consisting of vector data and cut-out images (regions (objects) such as characters, line drawings, natural images, and tables), and a background image.

The background image is generated by erasing, from the original image, the pixel information of the regions where the foreground image exists.

FIG. 6 is a diagram for explaining the background image, in which (A) shows the original image and (B) shows the background image.

The line-figure portions in FIG. 6(A), that is, the line-figure pixels of the character pixel blocks 601 to 603, the line drawing pixel block 608, and the table frame pixel block 604, are filled with the surrounding pixel color in the background image of (B).

For the natural image region 609, the entire rectangular area is filled with the surrounding pixel color.
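The background generation described above can be illustrated with a small sketch. This is not the patented implementation: the function name `make_background`, the list-of-lists image representation, and the rule of taking the fill color from the nearest non-foreground pixel in the same row are all assumptions chosen for brevity. The text only states that foreground pixels are filled with the surrounding pixel color.

```python
def make_background(image, foreground_mask):
    """Return a copy of `image` in which foreground pixels are replaced
    by the color of the nearest non-foreground pixel in the same row.

    `image` is a list of rows of pixel values; `foreground_mask` has the
    same shape and holds True where a foreground object was extracted.
    Both names and the nearest-in-row rule are illustrative assumptions.
    """
    background = [row[:] for row in image]
    for y, row in enumerate(image):
        width = len(row)
        for x in range(width):
            if not foreground_mask[y][x]:
                continue  # background pixel, keep as-is
            fill = None
            # Search outward along the row for a non-foreground pixel.
            for dist in range(1, width):
                for nx in (x - dist, x + dist):
                    if 0 <= nx < width and not foreground_mask[y][nx]:
                        fill = image[y][nx]
                        break
                if fill is not None:
                    break
            # If the whole row is foreground, the pixel is left unchanged.
            if fill is not None:
                background[y][x] = fill
    return background


# A white (255) page with a few dark (0) "character" pixels.
page = [[255, 255, 0, 255],
        [255, 0, 0, 255]]
mask = [[False, False, True, False],
        [False, True, True, False]]
result = make_background(page, mask)
```

A real implementation would more likely work on two-dimensional neighborhoods, and for a natural image region it would fill the whole bounding rectangle, as the text describes for region 609.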

For such background images, a function is known that generates data without adding the background image, in order to improve reusability for the user.

When this function is enabled, no data is generated for a page whose image contains no foreground image such as character data, so the page (image) itself is not output.

As a result, the number of pages in the generated data differs from the number of pages in the original. When the page counts differ and the original has many sheets, it becomes difficult to determine which pages are missing.

Moreover, when the person holding the original differs from the person receiving the digitized document, the recipient cannot tell that pages are missing.

An object of the present invention is to provide a mechanism that prevents the page corresponding to a document image from being omitted even when no object can be extracted from that image.

To achieve the above object, the image processing apparatus according to claim 1 comprises: reading means for reading a document; extraction means for extracting an object from an image of the document read by the reading means; and generation means for generating a page image that includes the object extracted by the extraction means and does not include the remaining portion that was not extracted by the extraction means, wherein the generation means generates a page corresponding to the image even when the extraction means cannot extract an object.

According to the present invention, even when no object can be extracted from a document image, omission of the page corresponding to that image can be prevented.

FIG. 1 is a diagram illustrating an example of an image processing system including an MFP according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a schematic configuration of the MFP in FIG. 1.

FIG. 3 is a flowchart showing the procedure of the image generation process executed by the CPU in FIG. 2.

FIG. 4 is a flowchart showing the procedure of a modification of the image generation process executed by the CPU in FIG. 2.

FIG. 5 is a diagram showing the notification image in FIG. 4.

FIG. 6 is a diagram for explaining the background image, in which (A) shows the original image and (B) shows the background image.

Embodiments of the present invention are described in detail below with reference to the drawings. The present embodiment describes a case in which the image processing apparatus according to the present invention is applied to an MFP (Multi Function Peripheral).

FIG. 1 is a diagram illustrating an example of an image processing system including an MFP 100 according to an embodiment of the present invention.

In FIG. 1, the image processing system 1 includes the MFP 100, a proxy server 103, and a client PC 101, which are connected via a LAN 102.

The MFP 100 is a multifunction machine that provides several types of functions related to image processing (for example, a copy function, a print function, and a transmission function).

The client PC 101 can, for example, transmit print data to the MFP 100 so that the MFP 100 prints a printed matter based on that print data.

The LAN 102 is connected, via the proxy server 103, to a network 104 that enables communication with the outside.

The network 104 only needs to be capable of transmitting and receiving data. Specific examples of the network 104 include the Internet, a LAN, a WAN, a telephone line, a dedicated digital line, an ATM or frame relay line, a communication satellite line, a cable television line, and a wireless line for data broadcasting, or any combination of these.

The client PC 101 and the proxy server 103 each have the standard components of a general-purpose computer. Specific examples of these components include a CPU, RAM, ROM, a hard disk, an external storage device, a network interface, a display, a keyboard, and a mouse.

FIG. 2 is a diagram showing a schematic configuration of the MFP 100 in FIG. 1.

In FIG. 2, the MFP 100 includes a CPU 117, a storage unit 111, a display unit 116, an operation unit 113, an image reading unit 110, a printing unit 112, a data processing unit 115, and a network interface 114.

The CPU 117 controls the MFP 100 as a whole. The storage unit 111 includes a ROM, a RAM, an HDD, and the like. The ROM stores programs such as a boot program. The RAM is used to load images and programs and as a work area. The HDD stores programs, images, databases, and the like.

The display unit 116 displays information to the user, such as the status of operation input and the image being processed. The operation unit 113 includes keys, buttons, and the like that the user operates. When the display unit 116 is provided with a touch panel, the touch panel also forms part of the operation unit 113.

The data processing unit 115 performs data processing such as signal processing. The network interface 114 is an interface for connecting to the LAN 102.

The image reading unit 110 includes an auto document feeder (ADF), irradiates a document with a light source, and forms the reflected image of the document on a solid-state image sensor through a lens. The image reading unit 110 then acquires the raster image reading signal from the solid-state image sensor as an image of a predetermined density (for example, 600 DPI).

The printing unit 112 prints an image on a recording medium. For example, the printing unit 112 prints an image corresponding to the image reading signal on a recording medium. When a single document image is copied, the data processing unit 115 processes the image reading signal obtained from the image reading unit 110 to generate a recording signal, and the printing unit 112 prints it on the recording medium.

When a plurality of document images are copied, the recording signal for one page is temporarily stored in the storage unit 111 and then output to the printing unit 112, and this process is repeated in sequence to print the images on the recording medium.

When printing print data that was output from the client PC 101 and received by the network interface 114, the printing unit 112 prints an image on a recording medium using the raster data processed by the data processing unit 115.

The MFP 100 also has a function of transmitting an image via the network interface 114.

At the time of transmission, the data processing unit 115 converts the image acquired by the image reading unit 110 into an image file in a compressed image file format such as TIFF or JPEG, or in a vector data file format such as PDF, and the file is output from the network interface 114.

The output image is transmitted to the client PC 101 via the LAN 102, or is further transferred to an external terminal (for example, another MFP or client PC) via the network 104.

Although the present embodiment is described using the MFP 100 as an example, it may also be applied to a scanner device capable of reading a document.

FIG. 3 is a flowchart showing the procedure of the image generation process executed by the CPU 117 in FIG. 2. The CPU 117 performs the processing shown in the flowchart of FIG. 3 by reading and executing a program stored in the storage unit 111.

In FIG. 3, the CPU 117 causes the image reading unit 110 to read one side of a document and acquires a one-page image. The CPU 117 then has the data processing unit 115 extract a foreground image from the acquired image (step S101). Taking FIG. 6 as an example, the foreground image is extracted from the original image of FIG. 6(A), leaving the background image shown in FIG. 6(B). The acquired original image can be separated into a foreground image and a background image. The foreground image consists of vector data and cut-out images (regions (objects) such as characters, line drawings, natural images, and tables). The background image is the image generated by erasing, from the original image, the pixel information of the regions where the foreground image exists. FIG. 6(A) shows the original image, and FIG. 6(B) shows the background image after the foreground image has been extracted from the original image.

Next, the CPU 117 determines whether to add a background image to the image to be generated (step S102). Here, for example, the user has set whether a background image is to be added to the generated image, and the CPU 117 makes the determination according to that setting.

If the determination in step S102 is that a background image is to be added (YES in step S102), the CPU 117 generates an image to which both the foreground image and the background image are added (step S106), and ends this process.

If the determination in step S102 is that no background image is to be added (NO in step S102), the CPU 117 determines whether any foreground image was extracted (step S103).

If the determination in step S103 is that a foreground image was extracted (YES in step S103), the CPU 117 generates an image containing only the foreground image (step S105), and ends this process.

If the determination in step S103 is that no foreground image was extracted (NO in step S103), the CPU 117 generates a blank image (an image indicating that no foreground image could be extracted) (step S104), and ends this process.
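The branching of steps S102 to S106 can be summarized in a short sketch. The `Page` class, the `generate_page` function, and the string labels below are hypothetical names introduced only for illustration; the step numbers in the comments refer to the flowchart of FIG. 3 as described in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    # "foreground+background", "foreground", or "blank" (illustrative labels)
    kind: str
    foreground: List[str] = field(default_factory=list)
    background: Optional[str] = None


def generate_page(foreground: List[str],
                  background: str,
                  add_background: bool) -> Page:
    """Generate exactly one output page for one scanned image.

    `foreground` stands for the objects extracted in step S101 and may be
    empty; `add_background` stands for the user setting checked in S102.
    """
    if add_background:                                    # S102: YES
        return Page("foreground+background",
                    list(foreground), background)         # S106
    if foreground:                                        # S103: YES
        return Page("foreground", list(foreground))       # S105
    return Page("blank")                                  # S104


with_bg = generate_page(["text"], "bg", add_background=True)
fg_only = generate_page(["text"], "bg", add_background=False)
empty = generate_page([], "bg", add_background=False)
```

The point of the sketch is that every path returns a page, so an image with no extractable foreground still contributes one page to the output.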

The images generated in S104, S105, and S106 are each generated as a page image of one page. When a plurality of images are acquired, for example when there are several document sheets or when both sides of a document are read, the CPU 117 repeats the above processing once for each side read, and generates, for each image, either an image including the corresponding foreground or a blank image. The CPU 117 then generates a single piece of document data (image data) in which these images, each either an image including the foreground or a blank image, are arranged in a predetermined order. The CPU 117 may also convert the document data into an image file in a compressed image file format such as TIFF or JPEG, or in a vector data file format such as PDF.

The predetermined order may be the order in which the acquired images were acquired. This order becomes the page order of the document data.

As an example of the order in which the acquired images were acquired: when a plurality of documents are read, the images are acquired in the order in which the documents were read, and that order can be used.
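As a rough illustration of how the per-image results are collected into one document in acquisition order, consider the following sketch. The function name `build_document` and the dictionary layout are assumptions made for this example, not part of the source.

```python
def build_document(scanned_images, add_background=False):
    """Build one document with exactly one page per scanned image.

    `scanned_images` is a list of (foreground_objects, background) tuples
    in the order the images were acquired. The output keeps that order.
    """
    document = []
    for foreground, background in scanned_images:
        if add_background:
            page = {"objects": list(foreground),
                    "background": background, "blank": False}
        elif foreground:
            page = {"objects": list(foreground),
                    "background": None, "blank": False}
        else:
            # No foreground and no background added: emit a blank page
            # instead of dropping the page.
            page = {"objects": [], "background": None, "blank": True}
        document.append(page)
    return document


scans = [(["title"], "bg0"), ([], "bg1"), (["table"], "bg2")]
doc = build_document(scans)
```

Because a page is appended on every iteration, the page count of the generated document always equals the number of scanned images, which is the property the text is after.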

The processing shown in the flowchart of FIG. 3 may also be performed when an instruction to execute the image generation process (foreground extraction process) is received from the operation unit 113 for an image that was read by the image reading unit 110 and is stored in the storage unit 111 of the MFP 100. In this case, the predetermined order described above may be the order in which the generation means generated the images including the foreground and the images indicating that no foreground image could be extracted. For example, when document data of several pages is stored in the storage unit 111 and execution of the image generation process (foreground extraction process) is instructed for that document data, the CPU 117 executes the processing shown in the flowchart of FIG. 3 on that document data. Because the image generation process is then executed in the page order of the original document data, the page order of the newly generated document data may be the order in which the generation means generates the images including the foreground and the images indicating that no foreground image could be extracted.

The image generation process of FIG. 3 is executed each time an image is acquired by the image reading unit 110, or for each page included in the original document data stored in the storage unit 111. When no foreground image is extracted, a blank image is generated, which prevents missing pages.

Specifically, in step S104, when no foreground image is extracted and the background image of the acquired image is not added to the image to be generated, a blank image indicating that no foreground image was extracted is generated, so missing pages are prevented. In steps S105 and S106, when a foreground image has been extracted, an image including the foreground image is generated. An image including the foreground image is either an image to which only the foreground image is added or an image to which both the foreground image and the background image are added. Because a blank image is generated for any image that has no foreground image, no page is lost even when such an image exists.

FIG. 4 is a flowchart showing the procedure of a modification of the image generation process executed by the CPU 117 in FIG. 2.

In FIG. 4, steps that perform the same processing as in FIG. 3 carry the same step numbers; the only difference from FIG. 3 is step S204.

If the determination in step S103 is that no foreground image was extracted (NO in step S103), the CPU 117 generates a notification image as the image indicating that no foreground image could be extracted (step S204), and ends this process.

FIG. 5 is a diagram showing the notification image 600 in FIG. 4.

In FIG. 5, the notification image 600 shows the message "This page has no character data." to notify the reader that no foreground image was extracted from the corresponding page of the original image data. This makes it possible to distinguish whether the original image was a blank image or an image consisting only of a background.

The CPU 117 may also generate the notification image 600 so that it includes a link 6001. The CPU 117 appends the background-only page, that is, the page that remains after the foreground image has been extracted and removed, after the final page of the generated images as a page (reference image) that can be viewed only when the link 6001 is selected. When the link 6001 is selected, the background-only page is displayed.

In the processing of FIG. 4, when no foreground image is extracted and the background image of the acquired image is not added to the image to be generated, a reference image containing the background image of that acquired image is generated in addition to the notification image, and the notification image includes a link to the reference image.
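The arrangement of notification pages and linked reference pages can be sketched as follows, for the case where the background image is not added (NO in step S102). All names here (`build_document_with_notices`, `NOTICE_TEXT`, the `link_to` key, and the use of a zero-based page index as the link target) are illustrative assumptions; the source does not specify how the link 6001 is encoded.

```python
NOTICE_TEXT = "This page has no character data."


def build_document_with_notices(scanned_images):
    """Build a document in which pages without foreground become
    notification pages linking to background-only reference pages
    appended after the final regular page.

    `scanned_images` is a list of (foreground_objects, background) tuples
    in acquisition order.
    """
    document = []
    pending = []  # (notice page, background) pairs awaiting a reference page
    for foreground, background in scanned_images:
        if foreground:
            document.append({"type": "foreground",
                             "objects": list(foreground)})   # S105
        else:
            notice = {"type": "notice", "message": NOTICE_TEXT,
                      "link_to": None}                        # S204
            document.append(notice)
            pending.append((notice, background))
    # Reference pages go after the last regular page, so the main page
    # sequence still matches the original page order one to one.
    for notice, background in pending:
        notice["link_to"] = len(document)  # zero-based index of the target
        document.append({"type": "reference", "background": background})
    return document


scans = [(["text"], "bg0"), ([], "bg1"), (["figure"], "bg2"), ([], "bg3")]
doc = build_document_with_notices(scans)
```

In this example the first four pages correspond to the four scanned images, and pages at indices 4 and 5 hold the background-only reference images for the two notification pages.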

The images generated in S204, S105, and S106 are each generated as a page image of one page. In the processing shown in the flowchart of FIG. 4 as well, when a plurality of images are acquired, for example when there are several document sheets or when both sides of a document are read, the CPU 117 repeats the above processing once for each side read. The CPU 117 then generates, for each image, either an image including the corresponding foreground or a blank image, and generates a single piece of document data in which these images are arranged in the predetermined order described above. The CPU 117 may also convert the document data into an image file in a compressed image file format such as TIFF or JPEG, or in a vector data file format such as PDF.

The document data generated by the methods described in the above embodiments is stored in the storage unit 111 by the CPU 117. In response to an instruction received from the operation unit 113 or the external client PC 101, this document data may be printed, or may be transmitted to an external apparatus via the network 104. When the generated document data is printed, the CPU 117 may print the images other than the reference images included in the generated image data and not print the reference images, or it may print the reference images as well. The user may also be allowed to set, from the operation unit 113 or the external client PC 101, whether the reference images are printed. Likewise, when the generated document data is transmitted to an external apparatus, the CPU 117 may transmit the images other than the reference images included in the generated image data and not transmit the reference images, or it may transmit the reference images as well. The user may also be allowed to set, from the operation unit 113 or the external client PC 101, whether the reference images are transmitted.
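The optional exclusion of reference images at print or transmission time amounts to a filter over the pages of the document data. The sketch below assumes each page is a dictionary with a `type` key and that the user setting is passed as a boolean; the function name `pages_for_output` and these representations are hypothetical.

```python
def pages_for_output(document, include_reference=False):
    """Return the pages to print or transmit.

    Reference pages are skipped unless `include_reference` is True,
    which stands for the user setting described in the text.
    """
    return [page for page in document
            if include_reference or page["type"] != "reference"]


document = [
    {"type": "foreground", "objects": ["text"]},
    {"type": "notice", "message": "This page has no character data."},
    {"type": "reference", "background": "bg1"},
]
to_print = pages_for_output(document)
to_send = pages_for_output(document, include_reference=True)
```

Using one function for both printing and transmission is a design choice of this sketch; the text treats the two settings as independent, so a real device could hold a separate flag for each.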

(Other embodiments)
The present invention can also be realized by executing the following processing. That is, software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or any of various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program code. In this case, the program and the storage medium storing the program constitute the present invention.

100 MFP
110 Image reading unit
115 Data processing unit
117 CPU

Claims

An image processing apparatus comprising:
reading means for reading a document;
extraction means for extracting an object from an image of the document read by the reading means; and
generation means for generating a page image that includes the object extracted by the extraction means and does not include the remaining portion that was not extracted by the extraction means,
wherein the generation means generates a page corresponding to the image even when the extraction means cannot extract an object.

The image processing apparatus according to claim 1, wherein
the reading means reads a plurality of documents,
the extraction means extracts an object from each image of the plurality of documents, and
the generation means generates, for each image of the plurality of documents, a page image including the object extracted by the extraction means, and, even when the extraction means cannot extract an object, generates a page corresponding to the image from which no object could be extracted.

The image processing apparatus according to claim 2, further comprising file generation means for generating a file that includes the plurality of page images and the page generated by the generation means.

The image processing apparatus according to any one of claims 1 to 3, wherein the image of the page is a blank image.

The image processing apparatus according to any one of claims 1 to 4, further comprising transmitting means for transmitting document data that includes the page image and the page generated by the generation means.

The image processing apparatus according to any one of claims 1 to 4, further comprising printing means for performing printing based on document data that includes the page image and the page generated by the generation means.

A method for controlling an image processing apparatus having reading means for reading a document, the method comprising:
an extraction step of extracting an object from an image of the document read by the reading means; and
a generation step of generating a page image that includes the object extracted in the extraction step and does not include the remaining portion that was not extracted in the extraction step,
wherein, in the generation step, a page corresponding to the image is generated even when no object can be extracted in the extraction step.

A program for causing a computer to execute the control method for an image processing apparatus according to claim 7.