JP2017045198A

JP2017045198A - Controller and control method

Info

Publication number: JP2017045198A
Application number: JP2015166180A
Authority: JP
Inventors: 宗亮加々谷; Muneaki Kagaya; 吉与博上村; Kiyohiro Uemura; 達也木本; Tatsuya Kimoto
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-08-25
Filing date: 2015-08-25
Publication date: 2017-03-02
Anticipated expiration: 2035-08-25
Also published as: JP6559017B2

Abstract

PROBLEM TO BE SOLVED: To provide a controller that recognizes a gesture made within a predetermined imaging range and outputs a command, and that can accurately recognize the gesture and output the command intended by the user even when the gesture leaves the imaging range.
SOLUTION: A display controller 200 outputs a command corresponding to a motion of a subject based on the position of a predetermined first part of the subject acquired from an image obtained by imaging means 100. When the position of the first part cannot be acquired from the image, the controller acquires from the image the position of a predetermined second part used for estimating the position of the first part, and obtains the position of the first part by estimating it from the position of the second part.
SELECTED DRAWING: Figure 2

Description

The present invention relates to a system for manipulating a displayed image in which commands can be input to a display device by gestures. In particular, it relates to a control device and a control method that allow a user to operate a display device intuitively by making specific motions with a hand.

As a user interface for controlling image display, there is a technique that recognizes a user's gesture and inputs a command corresponding to that gesture. For example, the position of a cursor displayed on the screen can be controlled in accordance with the movement of the hand. In this technique, the gesture is identified by detecting the movement of the hand within the imaging range from a moving image captured by a camera. Such a user interface lets the user input an intended command to the system more intuitively than conventional methods. The user inputs a command for controlling the display screen by gesturing within the imaging range of the gesture detection camera. As an intuitive command input technique using the user's fingers, Patent Document 1 discloses a technique for inputting a command that draws a line image (hereinafter referred to as marking) by a finger touch operation on a touch panel surface.

JP 2013-206350 A
JP 2014-120023 A

When marking commands are to be input by gesture, one conceivable approach is to move the hand with the index finger extended and the other four fingers closed (hereinafter referred to as the "point state"). With this gesture, the cursor on the display screen is moved and a command for marking along the trajectory of the cursor is input. However, the imaging range of a gesture detection camera or similar device is limited. If the index finger moves out of the imaging range while the display image is being marked based on the position information of the index finger within the imaging range, the marking is interrupted. A situation in which this becomes a problem will be described with reference to FIG. 17. In this situation, the original image shown in FIG. 17C is displayed enlarged as shown in FIG. 17B, and the user inputs by gesture a command for marking a desired region. If the movement of the user's hand leaves the imaging range as shown in FIG. 17A, the marking becomes semicircular as shown in FIG. 17B, and when the enlarged display is canceled and the display returns to the original image as shown in FIG. 17C, the marking result looks unnatural.

In Patent Document 1, when an indicator (for example, a cursor) moves outside the touch panel, the movement is not interpreted as intended by the user, and operability is improved by invalidating the command input, issuing a warning, guiding the user back to an appropriate range, and so on. With gesture-based command input, however, the imaging range is set in space, so gestures are likely to be made outside it. Even in such cases it is desirable to recognize the gesture intended by the user accurately rather than to invalidate the gesture input.

An object of the present invention is to enable a control device that recognizes a gesture made within a predetermined imaging range and outputs a command to recognize the gesture accurately and output the command intended by the user even when the gesture leaves the imaging range.

The present invention is a control device comprising:
imaging means;
acquisition means for acquiring a position of a predetermined first part of a subject from an image obtained by imaging with the imaging means; and
output means for outputting a command based on the position of the first part,
wherein, when the acquisition means cannot acquire the position of the first part from the image, the acquisition means acquires from the image a position of a predetermined second part used for estimating the position of the first part, and acquires the position of the first part by estimating it based on the position of the second part.

The present invention is also a control method comprising:
an imaging step;
an acquisition step of acquiring a position of a predetermined first part of a subject from an image obtained in the imaging step; and
an output step of outputting a command based on the position of the first part,
wherein, in the acquisition step, when the position of the first part cannot be acquired from the image, a position of a predetermined second part used for estimating the position of the first part is acquired from the image, and the position of the first part is acquired by estimating it based on the position of the second part.
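
The acquisition step with its fallback can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiments: the function names and the fixed-offset estimator are hypothetical, and the estimation actually used is described later with the correction model.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def acquire_first_part(
    first: Optional[Point],
    second: Optional[Point],
    estimate: Callable[[Point], Point],
) -> Optional[Point]:
    """Return the first-part position, estimating it from the second part if needed."""
    if first is not None:
        # The first part was found in the image: use it directly.
        return first
    if second is not None:
        # Fallback: estimate the first part from the second part.
        return estimate(second)
    # Neither part is available, so no position can be acquired.
    return None


def fixed_offset(second: Point) -> Point:
    """Hypothetical estimator: assumes a constant offset between the two parts."""
    return (second[0] + 20.0, second[1] - 10.0)
```

Passing the estimator as a parameter keeps the fallback logic separate from whatever estimation model is actually used.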

According to the present invention, a control device that recognizes a gesture made within a predetermined imaging range and outputs a command can recognize the gesture accurately and output the command intended by the user even when the gesture leaves the imaging range.

FIG. 1: Schematic diagram of the system configuration of Example 1
FIG. 2: Block diagram showing the configuration of the display control device of Example 1
FIG. 3: Diagram showing the user's hand shapes and hand information in Example 1
FIG. 4: Diagram showing the position information used for each hand shape in Example 1
FIG. 5: Table showing the position information of each part of the hand and the corresponding gestures in Example 1
FIG. 6: Example of a table showing gesture histories and the corresponding screen operations in Example 1
FIG. 7: Flowchart for explaining the operation of the entire system configuration of Example 1
FIG. 8: Diagram showing the still image to be displayed and the output image output to the display screen in Example 1
FIG. 9: Flowchart for explaining the operation of the display control device of Example 1
FIG. 10: Flowchart for explaining the correction processing of the display control device of Example 1
FIG. 11: Diagram showing an example of a table of correction model information for the position information in Example 1
FIG. 12: Diagram showing the relationship between position information in the imaging range and the content of the display screen in Example 1
FIG. 13: Diagram showing the relationship between position information in the imaging range and the content of the display screen in Example 1
FIG. 14: Diagram explaining the effect of continuing marking outside the imaging range in Example 1
FIG. 15: Block diagram showing the configuration of the display control device of Example 2
FIG. 16: Tables of correction model information for the position information before and after modification in Example 2
FIG. 17: Diagram explaining the outline of the problem to be solved by the present invention

(Example 1)
Example 1 of the present invention will be described.
FIG. 1 is a schematic diagram of a system configuration comprising an imaging device 100, a display control device 200, and a display device 400.
The imaging device 100 images an imaging range 101 and outputs the resulting image data to the display control device 200.
The display control device 200 acquires the position of a predetermined first part of a predetermined subject within the imaging range 101 and outputs a command based on that position. In Example 1, the predetermined subject is the hand 104 of the user 102, and the first part is a predetermined part of a predetermined finger that serves as an index of the position of the hand 104. The display control device 200 acquires the position of the first part repeatedly, recognizes the movement of the subject from the history of those positions, and outputs a command corresponding to the movement. In Example 1, the display control device 200 acquires the movement and shape of the hand 104 of the user 102, determines the gesture from them, and determines the image processing command that the user 102 has input. According to the content of the command, the display control device 200 controls image processing on the image 401 displayed on the display device 400. For example, when the user 102 moves the hand 104 within the imaging range 101, the display control device 200 obtains the position in the display image 401 that corresponds to the tip of the index finger, based on the positional relationship between the imaging range 101 and the tip of the index finger of the hand 104. It then performs image processing that composites a cursor 402 at that position in the display image 401.

FIG. 2 is a block diagram showing a configuration example of the display control device 200 to which the present invention is applied.
The display control device 200 includes a data storage unit 201, an image input unit 202, a display control unit 203, a captured image input unit 204, a specifying unit 205, a position information determination unit 206, a position information correction unit 207, a gesture determination unit 208, and a correction model holding unit 209. The display control device 200 also includes a CPU, a ROM, and a RAM (not shown). Following a program stored in the ROM, the CPU uses the RAM as work memory, manages time using a timer, and controls the operation of the entire display control device 200.

The data storage unit 201 stores still images to be displayed on the display device 400.
The image input unit 202 receives a plurality of still images from the data storage unit 201, adds index information such as the image order to the still images, and outputs them to the display control unit 203.
The display control unit 203 composites the image components that make up a GUI (Graphical User Interface), such as image-switching buttons and a cursor, onto the input still image, generates an output image, and outputs it to the display device 400.
The captured image input unit 204 receives the captured image obtained by the imaging device 100 and outputs it to the specifying unit 205.

The specifying unit 205 identifies the hand and fingers contained in the input captured image and generates hand information hand. The specifying unit 205 outputs the captured image and the hand information hand to the position information determination unit 206. Here, the hand information hand is information that can at least distinguish between different hand shapes. For example, FIG. 3A shows the hand information hand for a clenched hand (closed state), and FIG. 3B shows the hand information hand for a hand with the index finger extended (point state). In the example of FIG. 3, the hand information hand is expressed as a combination of values indicating the presence or absence of the hand, thumb, index finger, middle finger, ring finger, and little finger. The hand shapes are not limited to these. Various other shapes are conceivable, such as an open hand (open state) or a hand with the index and middle fingers extended, but their description is omitted here.

The position information determination unit 206 determines, from the hand information hand, the position information pos of each part of the hand 104 in the captured image. Here, the parts of the hand 104 are the wrist, the center of the hand, the base of each finger, and the tip of each finger, and the position information pos contains the position information of each of these parts. For example, FIG. 4A shows the position information pos for a hand in the closed state, and FIG. 4B shows the position information pos for a hand in the point state. When a finger is not extended, or when the tip of a finger lies outside the imaging range, the position information of that fingertip is NULL. The position information determination unit 206 outputs the position information pos of each part of the hand to the position information correction unit 207.
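
The hand information and the NULL convention for fingertips can be sketched as below. This is an illustration under assumptions: the names `HandInfo`, `FINGERS`, and `fingertip_positions` are hypothetical, Python's `None` stands in for NULL, and only fingertip entries are modeled (the wrist, hand center, and finger bases are omitted).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Point = Tuple[int, int]
FINGERS = ("thumb", "index", "middle", "ring", "little")


@dataclass
class HandInfo:
    """Presence flags as in FIG. 3: 1 = present/extended, 0 = absent."""
    hand: int
    fingers: Dict[str, int]


def fingertip_positions(
    info: HandInfo, detected: Dict[str, Point]
) -> Dict[str, Optional[Point]]:
    """Build fingertip entries of pos; None (NULL) when a tip is unavailable."""
    tips: Dict[str, Optional[Point]] = {}
    for name in FINGERS:
        extended = info.hand == 1 and info.fingers.get(name, 0) == 1
        # A tip that is not extended, or that was not detected inside the
        # imaging range, has no position value.
        tips[name] = detected.get(name) if extended else None
    return tips
```

For a hand in the point state, only the index entry can carry a value, and it becomes `None` as soon as the fingertip is no longer detected within the imaging range.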

The position information correction unit 207 corrects the position information pos of each part of the hand 104. It performs the correction when the position information pos of each part of the hand 104 and the previously determined gesture information ges(t-1) stored in the RAM satisfy specific conditions. The correction uses correction model information mdl stored in advance in the correction model holding unit 209. Details of the correction model information and the correction method will be described later.

Based on the position information pos of each part of the hand 104, the position information correction unit 207 obtains the position information ipos of the part that serves as an index of the position of the hand 104 (the index part). The position information ipos of the index part is used for cursor control and marking control. For example, the gesture for inputting cursor control and marking commands is the point state, in which the index finger is extended, and the part that serves as the index of the hand position in this state is the tip of the index finger. In this case, the position information ipos of the index part for cursor control and marking is, specifically, the position information of the tip of the index finger within the position information pos of the hand in the point state.

The position information correction unit 207 outputs the position information pos of each part of the hand 104 to the gesture determination unit 208, and outputs the position information ipos of the index part for cursor control and marking to the display control unit 203. It also holds the current position information ipos(t) of the index part in the RAM. The argument (t) denotes the position information acquired this time, and the argument (t-1) denotes the position information acquired the previous time.

The gesture determination unit 208 determines the gesture ges based on the position information pos of each part of the hand and a hand shape gesture correspondence table. Here, the hand shape gesture correspondence table is a table that maps each combination of present and absent values for the parts in the position information pos to the gesture ges corresponding to that combination. FIG. 5 shows an example of the hand shape gesture correspondence table. In FIG. 5, a value in the position information pos is shown as "absent" if it is NULL and as "present" otherwise. For example, when the position information of every part in pos is non-NULL, the gesture ges is determined to be Open, the state in which all fingers are extended. When the fingertip value is "absent" for all fingers, the gesture is Close, the state in which the hand is clenched. When the fingertip value is "present" for the index finger only, the gesture is Point, the state in which the index finger is extended. The gesture determination unit 208 holds the gesture information ges(t) in the RAM and outputs a command determination instruction to the display control unit 203.
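
The table lookup described above can be sketched as follows. This is a simplified illustration, not the table of FIG. 5 itself: it assumes only the fingertip entries of pos are inspected, represents NULL as `None`, and returns `None` for combinations the text does not describe. The function name is hypothetical.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[int, int]
FINGERS = ("thumb", "index", "middle", "ring", "little")


def determine_gesture(tips: Dict[str, Optional[Point]]) -> Optional[str]:
    """Map the present/absent pattern of fingertip positions to a gesture."""
    present = {name for name in FINGERS if tips.get(name) is not None}
    if present == set(FINGERS):
        return "Open"   # every fingertip has a value: all fingers extended
    if not present:
        return "Close"  # no fingertip has a value: clenched hand
    if present == {"index"}:
        return "Point"  # only the index fingertip has a value
    return None         # combination not covered by this sketch
```

Note that under this rule a Point hand whose index fingertip leaves the imaging range is indistinguishable from Close, which is why the position correction described later matters.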

The display control unit 203 includes a cursor control unit 210, a command determination unit 211, a drawing unit 212, and an image output unit 213.

The cursor control unit 210 calculates cursor position information cpos from the position information ipos of the index part for cursor control input from the position information correction unit 207. The cursor control unit 210 stores the cursor position information cpos(t) in the RAM and outputs the cursor position information cpos to the drawing unit 212.

The command determination unit 211 obtains command information ope based on the gesture history ges(n) stored in the RAM and a gesture conversion table. Here, the gesture history ges(n) is either the combination of the current (t) and previous (t-1) gesture determination results ges, or the combination of the current (t), previous (t-1), and second-previous (t-2) gesture determination results ges. The gesture conversion table is a table that gives the command corresponding to each gesture history ges(n). FIG. 6 shows an example of the gesture conversion table. The command determination unit 211 outputs the command information ope to the drawing unit 212.
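
A lookup of this kind can be sketched as below. The sketch is an assumption-laden illustration: the contents of FIG. 6 are not reproduced in this text, so the table contains only the one entry stated later in the description (Point followed by Point yields marking and cursor movement). The command names, the function name, and the choice to try the three-entry history before the two-entry history are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Keys are gesture histories ordered oldest to newest, e.g. (ges(t-1), ges(t)).
GESTURE_CONVERSION: Dict[Tuple[str, ...], Tuple[str, ...]] = {
    ("Point", "Point"): ("marking", "cursor_move"),
}


def determine_command(history: List[str]) -> Optional[Tuple[str, ...]]:
    """Look up the command for the most recent gestures in the history."""
    for length in (3, 2):  # assumed order: longer history first
        if len(history) >= length:
            key = tuple(history[-length:])
            if key in GESTURE_CONVERSION:
                return GESTURE_CONVERSION[key]
    return None  # no matching entry in the table
```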

Based on the still image, the command information ope, and the cursor position information cpos, the drawing unit 212 composites (superimposes) images such as markings, composites the cursor image onto the still image enlarged or reduced at the specified magnification, and outputs the result to the image output unit 213 as the output image.
The image output unit 213 outputs the output image to the display device 400, producing the image display 401.

Next, the operation of the entire system configuration will be described using the flowchart shown in FIG. 7.
The processing of this flowchart starts when the user 102 powers on the imaging device 100, the display control device 200, and the display device 400, with the power-on acting as the trigger.

In S701, the image input unit 202 acquires a still image from the data storage unit 201. FIG. 8 shows an example of the still image. The image input unit 202 outputs the still image to the drawing unit 212. The drawing unit 212 generates a display image of 1920 horizontal × 1080 vertical pixels and outputs it to the image output unit 213. The image output unit 213 outputs the display image to the display device 400, which produces the image display 401 on the display device 400.

In S702, the imaging device 100 starts imaging, triggered by the completion of image display on the display device 400 by the display control device 200. After imaging starts, the user 102 can input commands to the display control device 200 by gesturing with the hand 104 within the imaging range 101.

<Processing when correction is not required>
Next, the processing performed when the position of the first part of the subject can be acquired from the captured image will be described. As the specific processing of Example 1, the case in which a command for marking a still image is input by gesture and no correction of the hand and finger position information is needed will be described using the flowchart shown in FIG. 9. The display control device 200 starts this flowchart when a captured image is input to the captured image input unit 204.
Here, the captured image is 160 horizontal × 90 vertical pixels. It is assumed that the user 102 holds the hand in the point state, as shown in FIG. 12A.

In S901, the captured image input unit 204 outputs the captured image to the specifying unit 205. The specifying unit 205 identifies the hand and fingers shown in the input captured image. This processing is based on information such as hand shapes, contours, and silhouettes stored in advance in the ROM. Next, the specifying unit 205 generates the hand information hand. Here, because the hand 104 is in the point state as shown in FIG. 12A, the hand information hand is hand 1, thumb 0, index finger 1, middle finger 0, ring finger 0, little finger 0, as shown in FIG. 3B. The specifying unit 205 outputs the captured image and the hand information hand to the position information determination unit 206.

In S902, the position information determination unit 206 determines the position information pos of each part of the hand 104 from the captured image and the hand information hand. Here, the position information pos of each part of the hand 104 shown in FIG. 4B is obtained. Because the values of all fingers other than the index finger are "0" in the hand information hand obtained in S901 (FIG. 3B), the fingertip entries of the position information pos have no value (are NULL) for every finger other than the index finger. The position information determination unit 206 outputs the position information pos of each part of the hand 104 to the position information correction unit 207.

In S903, the position information correction unit 207 refers to the position information pos of each part of the hand 104 and determines whether position information for the tip of the index finger is present. If it is present (S903: Yes), the position information correction unit 207 judges that no correction of the position information pos is needed and outputs the position information pos of each part of the hand 104 acquired in S902 to the gesture determination unit 208. Because the hand 104 is in the point state here, the position information correction unit 207 outputs the position information of the tip of the index finger within pos to the cursor control unit 210 as the position information ipos (xi, yi) of the index part for cursor control. The position information correction unit 207 holds the position information ipos (xi, yi)(t) of the index part in the RAM. The processing then proceeds to S906. The case in which the position information pos of each part of the hand 104 contains no position information for the tip of the index finger (S903: No) will be described later.

In S906, the gesture determination unit 208 determines the gesture ges based on the position information pos of each part of the hand 104 and the hand shape gesture correspondence table. In the position information pos of each part of the hand 104 acquired from the position information correction unit 207 in S903 (FIG. 12B), only the index finger has fingertip position information, so the gesture determination unit 208 determines that the gesture ges is Point. The gesture determination unit 208 sets the gesture information ges as the gesture history entry ges(t) and holds it in the RAM. The gesture determination unit 208 outputs the gesture ges to the command determination unit 211.

In S907, the command determination unit 211 obtains the command information ope based on the gesture history ges(n) stored in the RAM and the gesture conversion table (FIG. 6). Here, it is assumed that the gesture history entries ges(t-1) and ges(t) are both Point. In this case, the command determination unit 211 determines the command to be marking and cursor movement, and outputs the command information ope (marking and cursor movement) to the drawing unit 212.

In S908, the cursor control unit 210 calculates the cursor position information cpos (xc, yc) based on the position information ipos (xi, yi) of the index part input from the position information correction unit 207. Here, the cursor control unit 210 calculates xc and yc from the following formulas, based on the ratio of the pixel counts of the captured image and the output image, to obtain the coordinates of the cursor position information cpos (xc, yc).

xc = xi × number of horizontal pixels of output image ÷ number of horizontal pixels of captured image
yc = yi × number of vertical pixels of output image ÷ number of vertical pixels of captured image

As shown in FIG. 12B, when the position information ipos of the index part is (130, 50), the cursor position information cpos is (1560, 600) from the above formulas. The cursor control unit 210 outputs the cursor position information cpos to the drawing unit 212.
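As an illustration only, the scaling of S908 could be sketched in Python as follows. The function name to_cursor is hypothetical, and the 1920 × 1080 output size is an assumption: it is not stated in this passage and is merely consistent with the worked values above.

```python
def to_cursor(ipos, captured_size, output_size):
    """Scale an index-part position from captured-image pixels to output-image pixels."""
    xi, yi = ipos
    cap_w, cap_h = captured_size
    out_w, out_h = output_size
    # xc = xi * (output width / captured width); yc likewise for the height (S908).
    xc = xi * out_w / cap_w
    yc = yi * out_h / cap_h
    return (xc, yc)


if __name__ == "__main__":
    # The 1920 x 1080 output size is an assumption, not taken from this passage.
    print(to_cursor((130, 50), (160, 90), (1920, 1080)))
```

Because the mapping is a plain ratio, a position outside the captured image maps to a position outside the output image, which is what the correction case described later relies on.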

In S909, the drawing unit 212 executes drawing processing based on the command information ope determined in S907 and the cursor position information cpos obtained in S908. Here, the command information ope includes marking. The drawing unit 212 therefore performs the marking image processing by compositing onto the still image a line connecting the cursor position information cpos (t−1) stored in the RAM and the cursor position information cpos (t). Since the command information ope also includes cursor movement, the drawing unit 212 composites an image representing the cursor onto the still image at the position indicated by the cursor position information cpos. The drawing unit 212 then outputs the still image with the marking (line) and the cursor composited to the image output unit 213 as the output image.

In S910, the image output unit 213 outputs the output image to the display device 400, and the image display 401 is obtained. As a result, as shown in FIG. 12C, a cursor and marking corresponding to the gesture of the hand 104 of the user 102 can be applied to the original still image.

<Processing when correction is required>
Next, processing for the case where the position of the first part of the subject cannot be acquired from the captured image will be described. In this case, the display control apparatus 200 acquires, from the captured image, the position of a predetermined second part used for estimating the position of the first part, and acquires the position of the first part by estimating it from the position of the second part. It then outputs a command based on the estimated position of the first part. As the specific processing of the first embodiment, the case where the position information of the hand and fingers needs to be corrected during gesture input of a command for marking a still image will be described with reference to the flowchart shown in FIG. 9. The display control apparatus 200 starts this flowchart when a captured image is input to the captured image input unit 204.
Here, the captured image is 160 pixels wide × 90 pixels high. As shown in FIG. 13A, the user 102 holds the hand in the point state, and the tip of the index finger is located outside the imaging range.

In S901, the captured image input unit 204 outputs the captured image to the specifying unit 205. The specifying unit 205 generates the hand information hand based on the input captured image. Here, as shown in FIG. 13A, the hand 104 is actually in the point state, but because the tip of the index finger is outside the imaging range, the hand information hand is hand 1, thumb 0, index finger 0, middle finger 0, ring finger 0, little finger 0, as shown in FIG. 3A. The specifying unit 205 outputs the captured image and the hand information hand to the position information determination unit 206.

In S902, the position information determination unit 206 determines the position information pos of each part of the hand 104 from the captured image and the hand information hand. Here, the position information pos of each part of the hand 104 as shown in FIG. 4A is obtained. Since the values of all fingers are “0” in the hand information hand (FIG. 3A) obtained in S901, the position information pos has no fingertip position value for any finger (the values are NULL). The position information determination unit 206 outputs the position information pos of each part of the hand 104 to the position information correction unit 207.

In S903, the position information correction unit 207 refers to the position information pos of each part of the hand 104 and determines whether position information for the tip of the index finger exists (whether the position of the first part can be acquired). Here, the position information correction unit 207 determines that there is no position information for the tip of the index finger (S903: No), and the process proceeds to S904.

In S904, the position information correction unit 207 refers to the gesture history ges (t−1) in the RAM and determines whether the previous gesture was Point. If the previous gesture was Point (S904: Yes), the position information correction unit 207 determines that the Point gesture is continuing, and the process proceeds to S905. If the previous gesture was not Point (S904: No), the process proceeds to S906.

In S905, the position information correction unit 207 performs correction processing on the position information pos of each part of the hand 104 in order to obtain the position information of the tip of the index finger, which is the index part for cursor control and marking. Details of the correction processing will be described later. Here, it is assumed that, as a result of the correction processing, the hand 104 is in the Point state and the position information of the tip of the index finger is (170, 60), which is outside the range of the 160 × 90 pixel captured image. Since the hand 104 after correction is in the point state, the position information correction unit 207 outputs the position information of the tip of the index finger in the corrected position information pos of each part of the hand 104 to the cursor control unit 210 as the position information ipos (xi, yi) of the index part for cursor control. The position information correction unit 207 holds the position information ipos (xi, yi) (t) of the index part in the RAM. The process proceeds to S906.
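The branching of S903 to S905 can be summarized in a short Python sketch. This is only one possible reading of the flowchart: the dictionary key "index_tip", the function name, and the correct callback (standing in for the correction processing described later) are all hypothetical.

```python
from typing import Callable, Optional, Tuple

Point2D = Tuple[float, float]


def index_part_position(
    pos: dict,
    previous_gesture: Optional[str],
    correct: Callable[[dict], Optional[Point2D]],
) -> Optional[Point2D]:
    """Return ipos for cursor control, following the S903 -> S904 -> S905 branches."""
    tip = pos.get("index_tip")
    if tip is not None:
        # S903: Yes. The fingertip was measured, so it is used as-is.
        return tip
    if previous_gesture != "Point":
        # S904: No. There is no Point gesture to continue, so no estimate is made.
        return None
    # S905: the fingertip is estimated from the other parts of the hand.
    return correct(pos)
```

Passing the correction step in as a callback keeps the branch logic separate from the correction model itself.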

In S906, the gesture determination unit 208 determines the gesture ges based on the position information pos of each part of the hand 104 and the hand shape gesture correspondence table. In the position information pos (FIG. 13B) of each part of the hand 104 corrected in S905, only the index finger has fingertip position information, so the gesture determination unit 208 determines that the gesture ges is Point. The gesture determination unit 208 sets the gesture information ges in the gesture history ges (t) and holds it in the RAM. The gesture determination unit 208 outputs the gesture ges to the command determination unit 211.

In S908, the cursor control unit 210 calculates the cursor position information cpos (xc, yc) based on the position information ipos (xi, yi) of the index part input from the position information correction unit 207. Here, the cursor control unit 210 calculates xc and yc from the following formulas, based on the ratio of the pixel counts of the captured image and the output image, to obtain the coordinates of the cursor position information cpos (xc, yc).

xc = xi × number of horizontal pixels of output image ÷ number of horizontal pixels of captured image
yc = yi × number of vertical pixels of output image ÷ number of vertical pixels of captured image

As shown in FIG. 13B, when the position information ipos of the index part is (170, 60), the cursor position information cpos is (2040, 720) from the above formulas. The cursor control unit 210 outputs the cursor position information cpos to the drawing unit 212.

In S909, the drawing unit 212 executes drawing processing based on the command information ope determined in S907 and the cursor position information cpos obtained in S908. Here, since the command information ope includes marking, the drawing unit 212 obtains a line connecting the cursor position information cpos (t−1) stored in the RAM and the cursor position information cpos (t), and performs the marking image processing.

Here, the position information ipos of the index part is the position (170, 60) outside the imaging range, and the cursor position information cpos is the position (2040, 720) outside the range displayed on the display device 400, so the drawing unit 212 does not perform the image processing of compositing the line onto the still image. However, it retains the line information. If the line later comes within the range displayed on the display device 400 through image processing of at least one of moving, enlarging, and reducing the display image, the drawing unit 212 composites the line onto the display image based on the retained line information. This achieves marking that does not look unnatural. For example, when marking is performed while the original still image is displayed enlarged, data of the original still image also exists at the position (2040, 720) outside the display range, so that position comes within the display range when the enlarged display is released. In such a case, because the information on the marking that was outside the display range during enlarged display has been retained, the marking can be composited at the appropriate position when the enlarged display is released.
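One way to picture the retention of off-screen line information is the following Python sketch. It is not the implementation described in this document: the class MarkingStore, the scale-and-offset view model, and the rule that a segment is drawn only when both endpoints are visible are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def inside(p: Point, size: Tuple[int, int]) -> bool:
    """True if p lies within a display of the given (width, height)."""
    return 0 <= p[0] < size[0] and 0 <= p[1] < size[1]


@dataclass
class MarkingStore:
    display_size: Tuple[int, int]
    segments: List[Segment] = field(default_factory=list)

    def add(self, start: Point, end: Point) -> None:
        # Every segment is retained, including ones that are currently off-screen.
        self.segments.append((start, end))

    def drawable(self, scale: float = 1.0, offset: Point = (0.0, 0.0)) -> List[Segment]:
        """Segments whose endpoints both fall inside the display after a view change."""
        result = []
        for a, b in self.segments:
            ta = (a[0] * scale + offset[0], a[1] * scale + offset[1])
            tb = (b[0] * scale + offset[0], b[1] * scale + offset[1])
            if inside(ta, self.display_size) and inside(tb, self.display_size):
                result.append((ta, tb))
        return result
```

With a 1920 × 1080 display (an assumed size), a segment from (1560, 600) to (2040, 720) is not drawable at scale 1.0 but becomes drawable once the view is reduced, for example at scale 0.5. A fuller version would clip partially visible segments rather than skip them.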

Since the command information ope also includes cursor movement, the drawing unit 212 composites an image representing the cursor onto the still image at the position indicated by the cursor position information cpos. Here, the position information ipos of the index part is the position (170, 60) outside the imaging range, and the cursor position information cpos is the position (2040, 720) outside the range displayed on the display device 400. The drawing unit 212 therefore composites onto the still image an image indicating the position of the cursor outside the display range. In the example of FIG. 13C, an arrow image pointing in the plus x direction is composited at the position (1920, 720) so that the user can tell that the cursor is located outside the range in the plus x direction. This allows the user to recognize where the cursor is located outside the display range.
The drawing unit 212 then outputs, to the image output unit 213 as the output image, the still image with the marking (line) and the image indicating the position of the cursor outside the display range composited.
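The placement of the off-screen cursor indicator can be sketched as below. The function name and the returned (anchor, direction) pair are hypothetical, and the 1920 × 1080 display size is assumed from the (1920, 720) arrow position given above. Clamping the cursor onto the display boundary is one simple way to reproduce that position, not necessarily the method used in the embodiment.

```python
def offscreen_indicator(cpos, display_size):
    """Return (anchor, direction) for an out-of-range cursor, or None if it is in range."""
    xc, yc = cpos
    width, height = display_size
    # Clamp the cursor onto the display boundary to get the arrow anchor.
    ax = min(max(xc, 0), width)
    ay = min(max(yc, 0), height)
    if (ax, ay) == (xc, yc):
        return None
    # Direction components are -1, 0, or +1 along each axis.
    dx = (xc > width) - (xc < 0)
    dy = (yc > height) - (yc < 0)
    return (ax, ay), (dx, dy)
```

For cpos (2040, 720) this gives the anchor (1920, 720) and the direction (1, 0), that is, an arrow pointing in the plus x direction.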

In S910, the image output unit 213 outputs the output image to the display device 400, and the image display 401 is obtained.
As a result, as shown in FIG. 13C, marking continues even if the tip of the index finger, which is the index part for marking and cursor control, moves out of the imaging range while the user 102 is inputting a command by gesture.

<Correction process>
Next, the flow of the correction processing of the position information pos of each part of the hand 104 in the position information correction unit 207 will be described with reference to FIG. 10.

In S1001, the position information correction unit 207 acquires the correction model information mdl stored in advance in the correction model holding unit 209. The correction model information is a numerical representation of the positional relationship between the parts of the hand when the hand is making the Point gesture, as shown in FIG. 11. In the example of FIG. 11, with the wrist position taken as (0, 0) and the position of the tip of the index finger taken as (x, y), the positional relationship between each part of the hand and the tip of the index finger is expressed as correction coefficients (x coefficient, y coefficient). Here, it is assumed that the correction model holding unit 209 stores a plurality of correction models corresponding to the rotation angle of the wrist. The position information correction unit 207 compares the values of the position information pos of each part of the hand 104 with the coefficients of the correction model information mdl, and improves the accuracy of the correction by using the correction model derived from the shape model closest to the current state of the hand. The correction model is determined based on the positional relationship between the first part and the second part in a shape model of the subject (the user's hand in the first embodiment).

In S1002, the position information correction unit 207 calculates the corrected position information of the tip of the index finger based on the correction model information mdl acquired in S1001. Here, an example using the correction model information mdl of FIG. 11 will be described. With the correction model information mdl, the corrected position information of the tip of the index finger can be calculated as long as the position information of at least two points has been acquired. For example, the formulas for calculating the corrected position information of the tip of the index finger from the position information of two points, the wrist and the center of the hand, are as follows.

xp = (xh − xn) / mh + xn
yp = (yh − yn) / nh + yn

Here, (xp, yp) is the corrected position information of the tip of the index finger, (xn, yn) is the position information of the wrist, (xh, yh) is the position information of the center of the hand, and (mh, nh) is the correction coefficient of the center of the hand. The position of the tip of the index finger is the position of the first part, and the plurality of positions (at least two points) used to obtain it are the positions of the second part used for estimating the position of the first part. Thus, in the present invention, when the position of the first part cannot be acquired from the captured image, it is acquired by estimating it from the position of the second part. For this estimation, as described above, information on the correction model, which is a predetermined correspondence between the position of the first part and the position of the second part, is stored in the storage means, and the position of the first part is estimated from the position of the second part based on this correspondence.

In the example of FIG. 13B, the position information of the wrist is (140, 45), the position information of the center of the hand is (150, 50), and the correction coefficient of the center of the hand is (0.33, 0.33) according to the correction model information mdl of FIG. 11. In this case, the above formulas give (170, 60) as the corrected position information of the tip of the index finger.
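The S1002 formulas can be checked with a small Python sketch. The function name is hypothetical, and the rounding to whole pixels mentioned below is an assumption: with a coefficient of 0.33 the formulas yield roughly (170.3, 60.2), which matches the stated (170, 60) only after rounding.

```python
def estimate_index_tip(wrist, hand_center, coeff):
    """Estimate the index fingertip from the wrist and hand-center positions (S1002)."""
    xn, yn = wrist
    xh, yh = hand_center
    mh, nh = coeff
    # xp = (xh - xn) / mh + xn and yp = (yh - yn) / nh + yn
    xp = (xh - xn) / mh + xn
    yp = (yh - yn) / nh + yn
    return (xp, yp)


if __name__ == "__main__":
    xp, yp = estimate_index_tip((140, 45), (150, 50), (0.33, 0.33))
    print(round(xp), round(yp))
```

The coefficient acts as the fraction of the wrist-to-fingertip distance at which the hand center lies, so dividing by it extends the wrist-to-center vector out to the fingertip.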

In S1003, the position information correction unit 207 determines whether the corrected position information (xp, yp) of the tip of the index finger calculated in S1002 is outside the imaging range. Here, for the imaging range of 160 × 90 pixels, the corrected position information (xp, yp) is (170, 60), which exceeds the imaging range in the plus x direction, so the position information correction unit 207 determines that it is outside the imaging range. If the corrected position information (xp, yp) of the tip of the index finger is outside the imaging range (S1003: Yes), the process proceeds to S1004; otherwise, the processing of this flowchart ends.

In S1004, based on the above determination, the position information correction unit 207 determines that the Point gesture is continuing outside the imaging range. It then replaces the position information of the tip of the index finger in the position information pos of each part of the hand 104 with the corrected position information (xp, yp) calculated in S1002. As a result, the position information pos of each part of the hand 104 is output with the position information of the tip of the index finger corrected from (NULL, NULL) to (170, 60).
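Steps S1003 and S1004 can be sketched together as follows. The dictionary layout of pos, the key "index_tip", and the use of None for NULL are assumptions made for illustration.

```python
def apply_correction(pos, corrected_tip, captured_size):
    """Replace the fingertip in pos when the estimate lies outside the frame (S1003, S1004)."""
    xp, yp = corrected_tip
    width, height = captured_size
    # S1003: is the estimated fingertip outside the imaging range?
    outside = not (0 <= xp < width and 0 <= yp < height)
    if not outside:
        # The flowchart ends without changing pos.
        return pos
    # S1004: the Point gesture continues outside the frame, so the estimate is adopted.
    updated = dict(pos)
    updated["index_tip"] = corrected_tip
    return updated
```

The sketch returns a copy so that the uncorrected position information remains available to the caller; whether the embodiment overwrites the data in place is not stated.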

As described above, in the present invention, the positions of the first part and the second part are acquired from a captured image obtained by imaging a predetermined imaging range with the imaging device. When the first part is outside the imaging range and the second part is within the imaging range, the position of the first part is estimated based on the position of the second part.
According to the first embodiment, as shown in FIG. 14A, marking can continue without interruption even when the user's gesture moves outside the imaging range. By retaining the information on markings outside the display range, such a marking can be drawn when it comes within the display range through image processing such as movement, enlargement, or reduction. Thus, in the first embodiment, even if the hand of the user 102 moves out of the imaging range during marking by a gesture operation, natural marking that follows the user's intention is possible, and operability improves.

The first embodiment described above can be modified in various ways within the scope of the present invention, as described below. Embodiments with the following modifications are also included in the scope of the present invention.
Although the imaging apparatus 100 images the user 102 in the X-Y plane, it may instead be configured to image only the hand 104 of the user 102 in the X-Z plane. In that case, to measure the Z coordinate, the imaging apparatus 100 may be a twin-lens stereo camera that measures by triangulation, or a distance sensor may be used for the measurement.
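The document does not specify how the stereo triangulation would be carried out. As a minimal sketch under the usual assumption of a rectified pinhole stereo pair, depth follows from Z = f × B ÷ d, where f is the focal length in pixels, B is the baseline between the two lenses, and d is the horizontal disparity. All names and parameters below are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline, x_left, x_right):
    """Depth of a point seen at x_left and x_right in a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero or negative disparity means the point is at infinity or mismatched.
        raise ValueError("disparity must be positive")
    # Z = f * B / d; the result is in the same unit as the baseline.
    return focal_length_px * baseline / disparity
```

For example, a focal length of 800 pixels, a baseline of 0.1 m, and a disparity of 40 pixels correspond to a depth of about 2 m.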

Although the tip of the index finger is used as the index part for marking and cursor control, another part, such as the center of the hand, may be used as the index part.
Although the base and tip of each finger, the wrist, and the center of the hand are used as the hand and finger parts whose position information can be detected, position information of more parts, such as each finger joint and parts of the arm, may also be made detectable.

As the correction processing, a method of calculating the position information of the tip of the index finger from an arbitrary combination of parts other than the tip of the index finger has been illustrated. However, priorities for use in the calculation may be set for the parts other than the tip of the index finger, and the position of the tip of the index finger may be calculated based on the positions of a plurality of high-priority parts. Alternatively, weights may be assigned to the results for the combinations of parts other than the tip of the index finger, and the position information of the tip of the index finger may be calculated from the results computed for all combinations and the weight values. Also, although the above embodiment shows an example of estimating the position of the first part based on the positions of a plurality of (two) second parts, the position of the first part may be estimated based on the position of a single second part. For example, acquisition of the first part and acquisition of the second part are repeated a plurality of times. If there is a round in which the position of the first part cannot be acquired from the captured image, the position of the first part may be estimated based on the position of the second part in that round, the history of positions of the second part acquired in earlier rounds, and the positional relationship between the second part and the first part. Specifically, the position information of the tip of the index finger may be predicted from the movement trajectory of a single part other than the tip of the index finger.
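The weighted variant could, for instance, take the form of a weighted mean over the per-combination estimates. The document does not define the weighting scheme, so the following Python sketch is only one plausible reading; the function name and the (position, weight) input format are assumptions.

```python
def weighted_tip_estimate(estimates):
    """Combine (position, weight) pairs into one fingertip estimate by weighted mean."""
    total = sum(weight for _, weight in estimates)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x = sum(p[0] * weight for p, weight in estimates) / total
    y = sum(p[1] * weight for p, weight in estimates) / total
    return (x, y)
```

For example, combining an estimate of (170, 60) with weight 3 and an estimate of (174, 64) with weight 1 gives (171.0, 61.0). The priority-based variant corresponds to giving zero weight to the low-priority combinations.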

Although only an imaging camera for detecting gesture operations is installed in the display control apparatus, a camera for gaze detection may also be installed, and the position information of the tip of the index finger may be predicted from the position information of parts other than the tip of the index finger and the gaze information.
Although the corrected position information is applied when the tip of the index finger moves outside the imaging range, means for detecting an obstacle in front of the camera and the hand may be provided, and the position information may also be acquired by the correction processing when it cannot be acquired because of the obstacle.

Although the cursor is displayed as an arrow, the cursor may instead be displayed as, for example, a hand shape that shows how far the five fingers are opened or closed. Although a single cursor is used, as many of these hand-shaped cursors as the number of identified hands may be displayed.
Although the user 102 operates with only one hand 104, gesture operation with both hands may be supported.
Although still images are the target, a configuration that targets moving images may be used.
Although the cursor display is controlled only by gestures, a configuration in which it can also be controlled from a mouse or keyboard may be used.
Although the data storage unit 201 is located inside the display control apparatus 200, it may be placed outside the display control apparatus 200, for example as a USB memory or in a configuration connected to a server via a LAN.
Although an example in which the display control apparatus 200 outputs image processing commands according to the user's gestures has been described, the commands are not limited to image processing. The present invention can be applied to a system that outputs commands by gesture to any device intended to operate according to a user's instructions.

(Example 2)
The second embodiment of the present invention will be described. The following description covers only the difference from the first embodiment, namely modification of the correction model according to the shape of the hand of the user 102. The second embodiment describes an example in which the correspondence between the position of the first part and the position of the second part is corrected based on the position of the first part and/or the position of the second part acquired from a captured image of the actual subject.
FIG. 15 is a block diagram illustrating a configuration example of the display control apparatus 200 to which the present invention is applied.
After specifying the position information pos of each part of the hand 104, the position information determination unit 206 outputs it to the position information correction unit 207 and also to the correction model correction unit 301.

補正モデル修正部３０１は、手１０４の各部位の位置情報ｐｏｓを取得すると、補正モデル保持部２０９から取得した補正モデル情報ｍｄｌに対して、現在のユーザの手の部位の位置関係を反映するような補正モデル情報ｍｄｌ’を算出して保持する。図１６（Ａ）は補正前の補正モデル情報ｍｄｌを示し、図１６（Ｂ）は補正後の補正モデル情報ｍｄｌ’を示す。
位置情報補正部２０７は、位置情報の補正が必要と判断された場合に、補正モデル修正部３０１より補正モデル情報ｍｄｌ’を取得して、位置情報の補正に利用する。
以上のように、本発明の実施例２では、ユーザ１０２の手の形状に応じて補正モデルを
修正することで、位置情報の補正精度を高めることができる。
（その他の実施例）
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 When the correction model correction unit 301 acquires the position information pos of each part of the hand 104, the correction model correction unit 301 reflects the current positional relationship of the hand part of the user to the correction model information mdl acquired from the correction model holding unit 209. Correct correction model information mdl ′ is calculated and stored. FIG. 16A shows correction model information mdl before correction, and FIG. 16B shows correction model information mdl ′ after correction.
The position information correction unit 207 acquires the correction model information mdl ′ from the correction model correction unit 301 when it is determined that the position information needs to be corrected, and uses it for correcting the position information.
As described above, in the second embodiment of the present invention, the correction accuracy of the position information can be increased by correcting the correction model according to the shape of the hand of the user 102.
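As a rough illustration of the model correction described above, the following Python sketch assumes that the correction model information mdl stores 2D coordinates of each hand part in a reference hand shape, and that mdl' is obtained by rescaling those coordinates by the ratio of the observed to the modeled distance between two parts found in the captured image. The data layout, the part names ("wrist", "index_base", "index_tip"), and the scaling rule are assumptions made for illustration only; the specification does not fix a particular representation or update rule.

```python
import math


def distance(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def correct_model(mdl, pos, ref_a="wrist", ref_b="index_base"):
    """Return a corrected model mdl' (hypothetical update rule).

    mdl: dict mapping part name -> (x, y) in a reference hand shape.
    pos: dict mapping part name -> (x, y) observed in the captured image.
    The model is scaled about ref_a so that the modeled distance between
    ref_a and ref_b matches the distance observed for the current user.
    """
    model_len = distance(mdl[ref_a], mdl[ref_b])
    observed_len = distance(pos[ref_a], pos[ref_b])
    scale = observed_len / model_len
    ox, oy = mdl[ref_a]
    return {
        part: (ox + (x - ox) * scale, oy + (y - oy) * scale)
        for part, (x, y) in mdl.items()
    }


# Reference model (assumed values) and positions observed for a larger hand.
mdl = {"wrist": (0.0, 0.0), "index_base": (0.0, 10.0), "index_tip": (0.0, 18.0)}
pos = {"wrist": (5.0, 5.0), "index_base": (5.0, 20.0)}
mdl_corrected = correct_model(mdl, pos)
```

Under these assumptions, the corrected model keeps the proportions of the reference hand while matching the size of the hand actually observed, so an offset taken from mdl' is closer to the current user's hand than one taken from mdl.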
(Other embodiments)
The present invention can also be realized by a process in which a program that realizes one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that realizes one or more functions.

１００：撮像装置、１０４：手、２０４：撮像画像入力部、２０６：位置情報決定部、２０７：位置情報補正部、２１０：操作決定部 100: imaging device, 104: hand, 204: captured image input unit, 206: position information determination unit, 207: position information correction unit, 210: operation determination unit

Claims

A control device comprising:
imaging means;
acquisition means for acquiring a position of a predetermined first part of a subject from an image obtained by imaging by the imaging means; and
output means for outputting a command based on the position of the first part,
wherein, when the acquisition means cannot acquire the position of the first part from the image, the acquisition means acquires, from the image, a position of a predetermined second part for estimating the position of the first part, and acquires the position of the first part by estimating it based on the position of the second part.

The control device according to claim 1, wherein the acquisition means acquires the positions of the first part and the second part from an image obtained by the imaging means imaging a predetermined imaging range, and, when the first part is outside the imaging range and the second part is within the imaging range, estimates the position of the first part based on the position of the second part.

The control device according to claim 1, wherein the acquisition means estimates the position of the first part based on the positions of a plurality of second parts.

The control device according to claim 1, further comprising storage means for storing information on a predetermined correspondence relationship between the position of the second part and the position of the first part,
wherein the acquisition means estimates the position of the first part from the position of the second part based on the correspondence relationship.

The control device according to claim 4, wherein the correspondence relationship is determined based on a positional relationship between the first part and the second part in the shape model of the subject.

The control device according to claim 5, further comprising correction means for correcting the correspondence relationship based on the position of the first part and/or the position of the second part acquired by the acquisition means.

The control device according to claim 1, wherein the acquisition means repeatedly acquires the position of the first part and the position of the second part a plurality of times, and, at a time when the position of the first part cannot be acquired from the image, estimates the position of the first part based on the position of the second part at that time, the history of the position of the second part acquired at previous times, and the positional relationship between the second part and the first part.

The control device according to claim 1, wherein the acquisition means repeatedly acquires the position of the first part and the position of the second part a plurality of times, and, at a time when the first part is outside the imaging range and the second part is within the imaging range, estimates the position of the first part based on the position of the second part at that time, the history of the position of the second part acquired at previous times, and the positional relationship between the second part and the first part.

The control device according to claim 7 or 8, wherein the acquisition means estimates the position of the first part based on the position of a single second part.

The control device according to claim 1, wherein the acquisition means repeatedly acquires the position of the first part a plurality of times, and
the output means recognizes a movement of the subject based on a history of the position of the first part and outputs a command corresponding to the movement of the subject.

The control device according to claim 1, wherein the output means outputs an image processing command for a display image displayed on a display device.

The control device according to claim 11, wherein the acquisition means acquires the positions of the first part and the second part from an image obtained by the imaging means imaging a predetermined imaging range, and
the output means obtains a position corresponding to the first part in the display image based on a positional relationship between the imaging range and the first part.

The control device, wherein the output means outputs an image processing command for combining a cursor at a position corresponding to the first part in the display image when the position of the first part is within the imaging range.

The control device, wherein the output means outputs an image processing command for combining an image indicating a position corresponding to the first part with the display image when the position of the first part is outside the imaging range.

The control device according to claim 12, wherein the acquisition means repeatedly acquires the position of the first part a plurality of times, and
the output means outputs an image processing command for combining a line connecting the plurality of positions of the first part with the display image.

The control device according to claim 15, wherein, when the plurality of positions of the first part include a position outside the imaging range, the output means calculates and holds information on the line lying outside the range displayed on the display device, and, when the line subsequently comes within the range displayed on the display device as a result of image processing of at least one of movement, enlargement, and reduction of the display image, outputs an image processing command for combining the line with the display image based on the held line information.

The control device according to claim 1, wherein the subject is a user's hand, and the movement of the subject is a gesture.

The control device according to claim 17, wherein the acquisition means further acquires the shape of the user's hand from the image obtained by imaging by the imaging means, and
the output means outputs a command based on the shape of the user's hand and the gesture.

The control device according to claim 17 or 18, wherein the first part of the subject is a predetermined position of a predetermined finger of a user's hand.

A control method comprising:
an imaging step;
an acquisition step of acquiring a position of a predetermined first part of a subject from an image obtained in the imaging step; and
an output step of outputting a command based on the position of the first part,
wherein, in the acquisition step, when the position of the first part cannot be acquired from the image, a position of a predetermined second part for estimating the position of the first part is acquired from the image, and the position of the first part is acquired by estimating it based on the position of the second part.
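For illustration only, the acquisition step recited above can be sketched in Python as follows. The sketch assumes that detected parts are given as 2D image coordinates, that the stored correspondence relationship is a fixed offset from each second part to the first part, and that estimates from several second parts are combined by simple averaging. The function name acquire_first_part, the part names, and the averaging rule are hypothetical and do not come from the specification or limit the claims.

```python
def acquire_first_part(detected, offsets, first="index_tip"):
    """Return (position, estimated) for the first part.

    detected: dict mapping part name -> (x, y) for parts found in the image.
    offsets:  dict mapping second-part name -> (dx, dy), the assumed stored
              correspondence from that second part to the first part.
    If the first part was detected, its position is returned as-is.
    Otherwise the position is estimated from every detected second part and
    the estimates are averaged. Returns (None, False) if nothing is usable.
    """
    if first in detected:
        return detected[first], False

    estimates = [
        (detected[part][0] + dx, detected[part][1] + dy)
        for part, (dx, dy) in offsets.items()
        if part in detected
    ]
    if not estimates:
        return None, False

    n = len(estimates)
    mean_x = sum(x for x, _ in estimates) / n
    mean_y = sum(y for _, y in estimates) / n
    return (mean_x, mean_y), True


# The fingertip has left the top of the imaging range (y < 0), but two
# second parts are still visible; all coordinates are assumed values.
offsets = {"index_base": (0.0, -20.0), "wrist": (0.0, -50.0)}
detected = {"index_base": (100.0, 10.0), "wrist": (100.0, 40.0)}
position, estimated = acquire_first_part(detected, offsets)
```

In this example the estimated position has a negative y coordinate, which corresponds to a first part located outside the imaging range; the output step can then issue a command from that position in the same way as from a directly detected one.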