WO2018100230A1 - Method and apparatuses for determining positions of multi-directional image capture apparatuses - Google Patents

Method and apparatuses for determining positions of multi-directional image capture apparatuses

Info

Publication number
WO2018100230A1
Authority
WO
WIPO (PCT)
Prior art keywords
images
cameras
image capture
positions
directional image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/FI2017/050749
Other languages
English (en)
Inventor
Tinghuai Wang
Yu You
Lixin Fan
Kimmo Roimela
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy
Publication of WO2018100230A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/579 Depth or shape recovery from multiple images from motion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85 Stereo camera calibration
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose

Definitions

  • the present specification relates to methods and apparatuses for determining positions of multi-directional image capture apparatuses.
  • Camera pose registration is an important technique used to determine positions and orientations of image capture apparatuses such as cameras.
  • this specification describes a method comprising performing image re-projection on each of a plurality of first images, wherein each first image is captured by a camera of a respective one of a plurality of multi-directional image capture apparatuses, thereby to generate a plurality of re-projected second images which are each associated with a respective virtual camera, processing the plurality of second images to generate respective positions of the virtual cameras associated with the second images, and based on the generated positions of the virtual cameras, determining a position of each of the plurality of multi-directional image capture apparatuses.
  • a plurality of second images may be generated from each first image.
  • Each of the second images may have a different viewing direction compared to each of the other second images.
  • the first images may be fisheye images.
  • the second images may be rectilinear images.
  • the processing of the plurality of second images to generate respective positions of the virtual cameras may comprise processing the second images using a structure from motion algorithm to generate the positions of the virtual cameras.
  • the determination of a position of each of the plurality of multi-directional image capture apparatuses based on the generated positions of the virtual cameras may comprise determining a position of each of the cameras of each of the plurality of multi-directional image capture apparatuses based on the generated positions of the virtual cameras, and determining the positions of each of the plurality of multi-directional image capture apparatuses based on the determined positions of the cameras.
  • the determination of a position of each of the cameras of each of the plurality of multidirectional image capture apparatuses based on the generated positions of the virtual cameras may comprise determining outliers and inliers in the generated positions of the virtual cameras, and determining the positions of each of the cameras based only on the inliers.
  • the processing of the plurality of second images may generate respective orientations of the virtual cameras, and the method may further comprise determining an orientation of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras.
  • the determination of an orientation of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras may comprise determining an orientation of each of the cameras of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras, and determining the orientation of each of the plurality of multi-directional image capture apparatuses based on the determined orientations of the cameras.
  • the position of each of the plurality of multi-directional image capture apparatuses may be determined based on both the generated positions and the generated orientations of the virtual cameras.
  • the method may further comprise determining a pixel to real world distance conversion factor based on the determined positions of the cameras.
  • the method may further comprise determining an up-vector of each of the multi-directional image capture apparatuses based on the determined positions of the cameras.
  • the up-vector may be determined by determining two respective vectors between the position of one of the cameras and the positions of two other cameras, and determining the cross product of the two vectors.
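The cross-product construction can be sketched as follows. This is a minimal illustration and not the specification's implementation: the function name `up_vector` is hypothetical, the result is normalised to unit length as an added convenience, and the sign of the result depends on the order in which the two other cameras are passed.

```python
import numpy as np

def up_vector(p0, p1, p2):
    """Unit vector perpendicular to the plane through three camera positions."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    # Two vectors from one camera to two other cameras, then their cross product.
    n = np.cross(p1 - p0, p2 - p0)
    length = np.linalg.norm(n)
    if length == 0.0:
        raise ValueError("camera positions are collinear; up-vector undefined")
    return n / length  # normalisation is an assumption, not stated in the text

# Three cameras lying in the x-y plane give an up-vector along z.
up = up_vector([0, 0, 0], [1, 0, 0], [0, 1, 0])
```

Swapping the last two arguments flips the direction of the returned vector, so a consistent camera ordering would be needed in practice.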
  • this specification describes apparatus configured to perform any method described with reference to the first aspect.
  • this specification describes computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform any method described with reference to the first aspect.
  • this specification describes apparatus comprising at least one processor, and at least one memory including computer program code, which when executed by the at least one processor, causes the apparatus to: perform image re-projection on each of a plurality of first images, wherein each first image is captured by a camera of a respective one of a plurality of multi-directional image capture apparatuses, thereby to generate a plurality of re-projected second images which are each associated with a respective virtual camera, process the plurality of second images to generate respective positions of the virtual cameras associated with the second images, and based on the generated positions of the virtual cameras, determine a position of each of the plurality of multi-directional image capture apparatuses.
  • a plurality of second images may be generated from each first image.
  • Each of the second images may have a different viewing direction compared to each of the other second images.
  • the first images may be fisheye images.
  • the second images may be rectilinear images.
  • the processing of the plurality of second images to generate respective positions of the virtual cameras may comprise processing the second images using a structure from motion algorithm to generate the positions of the virtual cameras.
  • the determination of a position of each of the plurality of multi-directional image capture apparatuses based on the generated positions of the virtual cameras may comprise determining a position of each of the cameras of each of the plurality of multi-directional image capture apparatuses based on the generated positions of the virtual cameras, and determining the positions of each of the plurality of multi-directional image capture apparatuses based on the determined positions of the cameras.
  • the determination of a position of each of the cameras of each of the plurality of multidirectional image capture apparatuses based on the generated positions of the virtual cameras may comprise determining outliers and inliers in the generated positions of the virtual cameras, and determining the positions of each of the cameras based only on the inliers.
  • the processing of the plurality of second images may generate respective orientations of the virtual cameras, and the computer program code, when executed by the at least one processor may cause the apparatus to determine an orientation of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras.
  • the determination of an orientation of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras may comprise determining an orientation of each of the cameras of each of the plurality of multi-directional image capture apparatuses based on the generated orientations of the virtual cameras, and determining the orientation of each of the plurality of multidirectional image capture apparatuses based on the determined orientations of the cameras.
  • the position of each of the plurality of multi-directional image capture apparatuses may be determined based on both the generated positions and the generated orientations of the virtual cameras.
  • the computer program code when executed by the at least one processor, may cause the apparatus to determine a pixel to real world distance conversion factor based on the determined positions of the cameras.
  • the computer program code when executed by the at least one processor, may cause the apparatus to determine an up-vector of each of the multi-directional image capture apparatuses based on the determined positions of the cameras.
  • the up-vector may be determined by determining two respective vectors between the position of one of the cameras and the positions of two other cameras, and determining the cross product of the two vectors.
  • this specification describes a computer-readable medium having computer-readable code stored thereon, the computer readable code, when executed by at least one processor, causes performance of performing image re-projection on each of a plurality of first images, wherein each first image is captured by a camera of a respective one of a plurality of multi-directional image capture apparatuses, thereby to generate a plurality of re-projected second images which are each associated with a respective virtual camera, processing the plurality of second images to generate respective positions of the virtual cameras associated with the second images, and determining a position of each of the plurality of multi-directional image capture apparatuses based on the generated positions of the virtual cameras.
  • the computer-readable code stored on the medium of the fifth aspect may further cause performance of any of the operations described with reference to the method of the first aspect.
  • this specification describes apparatus comprising means for performing image re-projection on each of a plurality of first images, wherein each first image is captured by a camera of a respective one of a plurality of multi-directional image capture apparatuses, thereby to generate a plurality of re-projected second images which are each associated with a respective virtual camera, means for processing the plurality of second images to generate respective positions of the virtual cameras associated with the second images, and means for determining a position of each of the plurality of multidirectional image capture apparatuses based on the generated positions of the virtual cameras.
  • the apparatus of the sixth aspect may further comprise means for causing performance of any of the operations described with reference to the method of the first aspect.
  • Figure 1 illustrates an example of multiple multi-directional image capture apparatuses in an environment
  • Figure 2 illustrates an example of processing of an image captured by a multi-directional image capture apparatus to generate re-projected images
  • Figures 3A to 3C illustrate the determination of the position and orientation of a multi-directional image capture apparatus relative to a reference coordinate system
  • Figure 4 illustrates an example of the determination of an up-vector of a multi-directional image capture apparatus
  • Figure 5 is a flowchart illustrating examples of various operations described herein.
  • Figure 6 is a schematic diagram of an example configuration of computing apparatus configured to perform various operations described herein.
  • Figure 7 illustrates an example of a computer-readable storage medium with computer readable instructions stored thereon.
  • Figure 1 illustrates a plurality of multi-directional image capture apparatuses 10 located within an environment.
  • the multi-directional image capture apparatuses 10 may, in general, be any apparatus capable of capturing images of the scene 13 from multiple different perspectives simultaneously.
  • multi-directional image capture apparatus 10 may be a 360° camera system (also known as an omnidirectional camera system or a spherical camera system).
  • multi-directional image capture apparatus 10 does not necessarily have to have full angular coverage of its surroundings and may only cover a smaller field of view.
  • each multi-directional image capture apparatus 10 may comprise a plurality of cameras 11.
  • the term "camera" used herein may refer to a sub-part of a multi-directional image capture apparatus 10 which performs the capturing of images.
  • each of the plurality of cameras 11 of multi-directional image capture apparatus 10 may be facing a different direction to each of the other cameras 11 of the multi-directional image capture apparatus 10.
  • each camera 11 of a multi-directional image capture apparatus 10 may have a different field of view, thus allowing the multi-directional image capture apparatus 10 to capture images of a scene 13 from different perspectives simultaneously.
  • each multi-directional image capture apparatus 10 may be at a different location to each of the other multi-directional image capture apparatuses 10.
  • each of the plurality of multi-directional image capture apparatuses 10 may capture images of the environment (via their cameras 11) from different perspectives simultaneously.
  • a plurality of multi-directional image capture apparatuses 10 are arranged to capture images of a particular scene 13 within the environment.
  • such information may be used for any of: performing 3D reconstruction of the captured environment, 3D registration of multi-directional image capture apparatuses 10 with respect to other sensors such as LiDAR (Light Detection and Ranging) or infrared (IR) depth sensors, audio positioning of audio sources, playback of object-based audio with respect to multi-directional image capture apparatus 10 location, and presenting the positions of multi-directional image capture apparatuses as 'hotspots' to which a viewer can switch during virtual reality (VR) viewing.
  • GPS Global Positioning System
  • magnetometers and accelerometers installed in the multi-directional image capture apparatuses 10.
  • such instruments may be susceptible to local disturbance (e.g. magnetometers may be disturbed by a local magnetic field), so the accuracy of orientation information obtained in this way is not necessarily very high.
  • position and orientation information can be obtained by performing structure from motion (SfM) analysis on images captured by a multi-directional image capture apparatus 10.
  • SfM works by determining point correspondences between images (also known as feature matching) and calculating location and orientation based on the determined point correspondences.
  • SfM analysis may be unreliable due to unreliable determination of point correspondences between images.
  • Figure 2 illustrates one of the plurality of multi-directional image capture apparatuses 10 of Figure 1.
  • a camera 11 of the multi-directional image capture apparatus 10 may capture a first image 21.
  • the first image 21 may be an image of a scene within the field of view 20 of the camera 11.
  • the lens of the camera 11 may be a fish-eye lens and so the first image 21 may be a fish-eye image (in which the camera field of view is enlarged).
  • the method described herein may be applicable for use with lenses and resulting images of other types.
  • the camera pose registration method described herein may also be applicable to images captured by a camera with a hyperbolic mirror in which the camera optical centre coincides with the focus of the hyperbola and images captured by a camera with a parabolic mirror and an orthographic lens in which all reflected rays are parallel to the mirror axis and the orthographic lens is used to provide a focused image.
  • the first image 21 may be processed to generate one or more second images 22. More specifically, image re-projection may be performed on the first image 21 to generate one or more re-projected second images 22.
  • if the first image 21 is not a rectilinear image (e.g. a fish-eye image), it may be re-projected to generate one or more second images 22 which are rectilinear images (as illustrated by Figure 2).
  • the type of re-projection may be dependent on the algorithm used to analyse the second images. For instance, as is explained below, a structure from motion algorithm, which is typically used to analyse rectilinear images, may be used, in which case the re-projection may be selected so as to generate rectilinear images.
  • the re-projection may generate any type of second image, as long as the image type is compatible with the algorithm used to analyse the re-projected images.
  • Each re-projected second image 22 may be associated with a respective virtual camera.
  • a virtual camera is an imaginary camera which does not physically exist, but which corresponds to a camera which would have captured the re-projected second image 22 with which it is associated.
  • a virtual camera is defined by virtual camera parameters which represent the configuration of the virtual camera required in order to have captured the second image 22.
  • a virtual camera can be treated as a real physical camera.
  • each virtual camera has, among other virtual camera parameters, a position and orientation which can be determined.
  • each re-projected second image 22 may have a different viewing direction compared to each of the other second images 22.
  • the virtual camera of each second image 22 may have a different orientation compared to each of the other virtual cameras.
  • the orientation of each of the virtual cameras may also be different to the orientation of the real camera 11 which captured the first image 21.
  • each virtual camera may have a smaller field of view than the real camera 11 as a result of the re-projection.
  • the virtual cameras may have overlapping fields of view with each other.
  • the orientations of the virtual cameras may be pre-set.
  • the re-projection of the first image 21 may generate second images 22 with associated virtual cameras which each have a certain pre-set orientation relative to the orientation of the real camera 11.
  • the orientation of each virtual camera may be pre-set such that it has certain yaw, pitch and roll angles relative to the real camera 11.
  • any number of second images 22 may be generated.
  • generating more second images 22 leads to less distortion in each of the second images 22, but may also increase computational complexity.
  • the precise number of second images may be chosen based on the scene/environment being captured by the multi-directional image capture apparatus 10.
  • the re-projection process described with reference to Figure 2 may be performed for a plurality of first images 21 respectively captured by a plurality of cameras 11 of the multi-directional image capture apparatus 10. Furthermore, the same process may be performed for each of a plurality of multi-directional image capture apparatuses 10 which are capturing the same general environment, e.g. the plurality of multi-directional image capture apparatuses 10 as illustrated in Figure 1. In this way, all of the first images 21 captured by a plurality of multi-directional image capture apparatuses 10 of a particular scene may be processed as described above. It will be appreciated that the first images 21 may correspond to images of a scene at a particular moment in time.
  • a first image 21 may correspond to a single video frame of a single camera 11, and all of the first images 21 may be video frames that are captured at the same moment in time.
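The re-projection of a fish-eye first image into several rectilinear second images, each with a pre-set virtual camera orientation, might be sketched as below. Several choices here are assumptions that the text does not specify: an equidistant fish-eye model (r = f·θ), nearest-neighbour sampling, a yaw/pitch/roll convention of rotations about the y, x and z axes, and the hypothetical names `rotation_from_ypr` and `reproject`.

```python
import numpy as np

def rotation_from_ypr(yaw, pitch, roll):
    """Rotation from yaw (about y), pitch (about x), roll (about z), in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])
    return Ry @ Rx @ Rz

def reproject(fisheye, f_fish, R_rel, out_size, f_virt):
    """Sample a rectilinear view from an equidistant fish-eye image.

    R_rel maps directions in the virtual camera frame to the real camera frame.
    """
    h_in, w_in = fisheye.shape[:2]
    cx_in, cy_in = (w_in - 1) / 2.0, (h_in - 1) / 2.0
    h_out, w_out = out_size
    u, v = np.meshgrid(np.arange(w_out), np.arange(h_out))
    # Viewing ray through each virtual-camera pixel (pinhole model, z forward).
    rays = np.stack([u - (w_out - 1) / 2.0,
                     v - (h_out - 1) / 2.0,
                     np.full(u.shape, float(f_virt))], axis=-1)
    rays = rays @ R_rel.T  # rotate rays into the real camera frame
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    theta = np.arccos(np.clip(rays[..., 2], -1.0, 1.0))  # angle from optical axis
    phi = np.arctan2(rays[..., 1], rays[..., 0])
    r = f_fish * theta  # assumed equidistant fish-eye projection
    x = np.rint(cx_in + r * np.cos(phi)).astype(int)
    y = np.rint(cy_in + r * np.sin(phi)).astype(int)
    valid = (x >= 0) & (x < w_in) & (y >= 0) & (y < h_in)
    out = np.zeros((h_out, w_out) + fisheye.shape[2:], dtype=fisheye.dtype)
    out[valid] = fisheye[y[valid], x[valid]]  # pixels outside the image stay zero
    return out

# Three virtual cameras with pre-set yaw angles relative to the real camera.
img = np.arange(101 * 101, dtype=float).reshape(101, 101)
views = [reproject(img, 100.0, rotation_from_ypr(yaw, 0.0, 0.0), (11, 11), 20.0)
         for yaw in (-0.5, 0.0, 0.5)]
```

Each entry of `views` corresponds to one virtual camera; a real pipeline would likely use a calibrated lens model and interpolated sampling instead.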
  • Figures 3A to 3C illustrate the process of determining the positions and orientations of a multi-directional image capture apparatus 10.
  • each arrow 31, 32, 33 represents the position and orientation of a particular element in a reference coordinate system 30.
  • the base of the arrow represents the position and the direction of the arrow represents the orientation.
  • each arrow 31 in Figure 3A represents the position and orientation of a virtual camera associated with a respective second image
  • each arrow 32 in Figure 3B represents the position and orientation of a real camera 11 (determined based on the positions and orientations of the re-projected second images 22 derived from the first image 21 captured by the real camera)
  • the arrow 33 in Figure 3C represents the position and orientation of the multi-directional image capture apparatus 10.
  • the one or more second images are processed to generate respective positions of the virtual cameras associated with the second images, the generated positions being relative to the reference coordinate system 30.
  • the processing of the one or more second images may also generate respective orientations of the virtual cameras relative to the reference coordinate system 30.
  • the processing may involve processing a plurality of the second images generated from first images captured by a plurality of different multi-directional image capture apparatuses 10.
  • it may be necessary for the multi-directional image capture apparatuses 10 to have at least partially overlapping fields of view with each other (for example, in order to allow point correspondence determination as described below).
  • each cluster 34 of arrows 31 in Figure 3A represents the virtual cameras corresponding to a single first image of a single real camera.
  • the above described processing may be performed by using a structure from motion (SfM) algorithm to determine the position and orientation of each of the virtual cameras.
  • the SfM algorithm may operate by determining point correspondences between various ones of the second images and determining the positions and orientations of the virtual cameras based on the determined point correspondences.
  • the determined point correspondences may impose certain geometric constraints on the positions and orientations of the virtual cameras, which can be used to solve a set of quadratic equations to determine the positions and orientations of the virtual cameras relative to the reference coordinate system 30.
  • the SfM process may involve any one of or any combination of the following operations: extracting image features, matching image features, estimating camera position, reconstructing 3D points, and performing bundle adjustment.
  • the position of each of the real cameras 11 relative to the reference coordinate system 30 may be determined based on the determined positions of the virtual cameras. Similarly, once the orientations of the virtual cameras have been determined, the orientation of each of the real cameras 11 relative to the reference coordinate system 30 may be determined based on the determined orientations of the virtual cameras.
  • the position of each real camera may be determined by averaging the positions of the virtual cameras corresponding to the real camera.
  • the orientation of each real camera may be determined by averaging the orientations of the virtual cameras corresponding to the real camera.
  • each cluster of arrows 34 in Figure 3A may be averaged to obtain a corresponding arrow 32 in Figure 3B.
  • the above described determination of the positions of each of the real cameras 11 may further involve determining outliers and inliers in the generated positions of the virtual cameras and determining the positions of each of the real cameras 11 based only on the inliers.
  • the above mentioned averaging may involve only averaging the inlier positions. This may improve the accuracy of the determined positions of the real cameras 11.
  • the inlier and outlier determination may be performed according to:
  • $$d_i = \lVert c_i - \operatorname{Median}(C_{virtual}) \rVert, \qquad d_\sigma = \operatorname{Median}(\{d_1, \ldots, d_N\}), \qquad \text{inliers} = \{\, c_i : d_i / d_\sigma < m, \ \forall i \in \{1, \ldots, N\} \,\}$$ where $C_{virtual}$ is the set of the positions $c_i$ of the $N$ virtual cameras, $d_i$ is a measure of the difference between the position of a virtual camera and the median position of all of the virtual cameras, $d_\sigma$ is the median absolute deviation (MAD), and $m$ is a threshold value below which a determined virtual camera position is considered an inlier (for example, $m$ may be set to be 2).
  • a virtual camera may be determined to be an inlier if the difference between its position and the median position of all of the virtual cameras divided by the median absolute deviation is less than a threshold value. That is to say, for a virtual camera to be considered an inlier, the difference between its position and the median position of all of the virtual cameras must be less than a threshold number of times larger than the median absolute deviation.
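The inlier test and the averaging over inliers can be sketched as follows. This is an illustrative sketch under stated assumptions: the median position is taken component-wise, the difference measure is the Euclidean norm, and the degenerate case of a zero MAD is handled by keeping only positions that coincide with the median. The function name `mad_inlier_mean` is hypothetical.

```python
import numpy as np

def mad_inlier_mean(positions, m=2.0):
    """Average only those positions whose MAD-scaled distance is below m."""
    c = np.asarray(positions, dtype=float)  # shape (N, 3)
    # Distance of each position from the (component-wise) median position.
    d = np.linalg.norm(c - np.median(c, axis=0), axis=1)
    d_sigma = np.median(d)  # median absolute deviation
    if d_sigma == 0.0:
        mask = d == 0.0  # assumed handling of the degenerate case
    else:
        mask = d / d_sigma < m
    return c[mask].mean(axis=0), mask

# Four consistent virtual camera positions and one gross outlier.
positions = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0],
             [0.0, 0.1, 0.0], [5.0, 5.0, 5.0]]
mean_position, inlier_mask = mad_inlier_mean(positions, m=2.0)
```

With the threshold of 2 mentioned above, the last position is rejected and the average is taken over the first four only.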
  • the orientation of each real camera may be determined in the following way.
  • the orientation of each virtual camera may be represented by a rotation matrix $R_v$.
  • the orientation of each real camera 11 relative to the reference coordinate system 30 may be represented by a rotation matrix $R_i$.
  • the orientation of each virtual camera relative to its corresponding real camera 11 may be known as this may be pre-set (as described above with reference to Figure 2), and may be represented by rotation matrix $R_{vi}$.
  • the rotation matrix of each virtual camera may be used to obtain a rotation matrix for the real camera 11 according to: $$R_i = R_v R_{vi}^{-1}$$
  • the rotation matrix of a real camera may be determined by multiplying the rotation matrix of a virtual camera ($R_v$) onto the inverse of the rotation matrix representing the orientation of the virtual camera relative to the orientation of the real camera ($R_{vi}^{-1}$).
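A small numerical check of this relation is sketched below. The helper names `rot_x` and `rot_z` and the example angles are hypothetical, and the composition convention (the virtual camera orientation being the real camera orientation multiplied by the pre-set relative rotation) is an assumption chosen to be consistent with the multiplication order described above.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

R_i_true = rot_z(0.3)    # hypothetical true orientation of the real camera
R_vi = rot_x(0.5)        # pre-set orientation of a virtual camera relative to it
R_v = R_i_true @ R_vi    # orientation an SfM step would report (assumed convention)

# Recover the real camera orientation: R_i = R_v R_vi^{-1}.
# The inverse of a rotation matrix is its transpose.
R_i = R_v @ R_vi.T
```

Each virtual camera yields one such estimate of the real camera orientation, which is why an averaging step follows.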
  • $$\theta_i = \arctan\left(\frac{\sum_{k} \sin\theta_k}{\sum_{k} \cos\theta_k}\right)$$ where $\theta_i$ represents the averaged Euler angles for a real camera and $\theta_k$ represents the set of Euler angles.
  • the averaged Euler angles are determined by calculating the sum of the sines of the set of Euler angles divided by the sum of the cosines of the set of Euler angles, and taking the arctangent of the ratio. $\theta_i$ may then be converted back into a rotation matrix representing the final determined orientation of real camera 11.
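The sine/cosine averaging can be sketched as below. Note one deliberate substitution: the sketch uses the two-argument `arctan2` rather than a plain arctangent of the ratio, so that the quadrant is preserved and a zero cosine sum does not cause a division by zero. The function name `average_angles` and the example values are hypothetical.

```python
import numpy as np

def average_angles(angles):
    """Per-component circular mean of a set of Euler angles (radians)."""
    a = np.asarray(angles, dtype=float)  # shape (N, 3): one row per virtual camera
    return np.arctan2(np.sin(a).sum(axis=0), np.cos(a).sum(axis=0))

# Two estimates whose first angle straddles the +/-180 degree wrap-around.
angles = np.deg2rad([[170.0, 10.0, 0.0],
                     [-170.0, 20.0, 0.0]])
avg = average_angles(angles)
```

A plain arithmetic mean of 170° and -170° would give 0°, which points in the opposite direction; the sine/cosine form gives 180° (up to sign), and 15° for the second component.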
  • unit quaternions may be used instead of Euler angles for the abovementioned process.
  • the use of unit quaternions to represent orientation is a known mathematical technique and will not be described in detail here. Briefly, quaternions $q_1, q_2, \ldots, q_N$ corresponding to the virtual camera rotation matrices may be determined. Then, the quaternions may be transformed, as necessary, to ensure that they are all on the same side of the 4D hypersphere. Specifically, one representative quaternion $q_M$ is selected and the signs of any quaternions $q_i$ where the product of $q_M$ and $q_i$ is less than zero may be inverted.
  • all quaternions $q_i$ may be summed into an average quaternion $q_A$, and $q_A$ may be normalised into a unit quaternion $q_A'$.
  • the unit quaternion $q_A'$ may represent the averaged orientation of the camera and may be converted back to other orientation representations as desired. Using unit quaternions to represent orientation may be more numerically stable than Euler angles.
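The quaternion averaging steps can be sketched as follows. Two points are assumptions rather than statements from the text: the "product" of $q_M$ and $q_i$ is interpreted as the 4D dot product, and the first quaternion is used as the representative. The name `average_quaternions` and the (w, x, y, z) component order are likewise illustrative.

```python
import numpy as np

def average_quaternions(quats):
    """Sign-align unit quaternions to a representative, sum, and renormalise."""
    q = np.asarray(quats, dtype=float)  # shape (N, 4), order (w, x, y, z)
    q_m = q[0]                          # representative quaternion (assumed choice)
    # Flip any quaternion lying on the opposite side of the 4D hypersphere.
    signs = np.where(q @ q_m < 0.0, -1.0, 1.0)
    q_a = (q * signs[:, None]).sum(axis=0)
    return q_a / np.linalg.norm(q_a)    # unit quaternion q_A'

def quat_about_z(angle):
    return np.array([np.cos(angle / 2), 0.0, 0.0, np.sin(angle / 2)])

# Rotations of 0.2 and 0.4 rad about z; the second is stored with flipped sign,
# which represents the same rotation.
q_avg = average_quaternions([quat_about_z(0.2), -quat_about_z(0.4)])
```

Without the sign alignment the two quaternions would largely cancel when summed; with it, the result corresponds to a rotation of 0.3 rad about z.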
  • the orientation of the multi-directional image capture apparatus 10 relative to the reference coordinate system 30 may be determined in the following way.
  • the orientation of the multi-directional image capture apparatus 10 may be represented by rotation matrix $R_{dev}$.
  • the orientation of each real camera 11 relative to its corresponding multi-directional image capture apparatus 10 may be known, and may be represented by rotation matrix $R_{idev}$.
  • the rotation matrices $R_i$ of the real cameras 11 may be used to obtain a rotation matrix for the multi-directional image capture apparatus 10 according to: $$R_{dev} = R_i R_{idev}^{-1}$$
  • the rotation matrix of a multi-directional image capture apparatus can be determined by multiplying the rotation matrix of a real camera ($R_i$) onto the inverse of the matrix representing the orientation of the real camera relative to the orientation of the multi-directional image capture apparatus ($R_{idev}^{-1}$).
  • the position of the multi-directional image capture apparatus 10 may be determined in the following way.
  • the position of each real camera 11 relative to its corresponding multi-directional image capture apparatus 10 may be known, and may be represented by vector $v_{idev}$. However, $v_{idev}$ is relative to a local coordinate system of the multi-directional image capture apparatus. To obtain the position of each real camera 11 relative to its corresponding multi-directional image capture apparatus 10 in the reference coordinate system 30, $v_{idev}$ may be rotated according to: $$v^{w}_{idev} = R_{dev}\, v_{idev}$$
  • $R_{dev}$ is the final rotation matrix of the multi-directional image capture apparatus 10 as determined above
  • $v^{w}_{idev}$ is a vector representing the position of each real camera 11 relative to the multi-directional image capture apparatus 10, expressed relative to the reference coordinate system 30.
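  • the position of the multi-directional image capture apparatus 10 may be determined according to: $c_{dev} = c_i - v_{widev}$, where c_i is the position vector of a real camera 11 in the reference coordinate system 30.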
  • a position of the multi-directional image capture apparatus 10 may be determined by taking the difference between the position vector of a real camera 11 and the position vector of the real camera relative to the multi-directional image capture apparatus.
  • the same inlier and outlier determination and averaging process as described above may then be applied to the set of positions c_dev obtained from the individual real cameras 11 to obtain a final position for the multi-directional image capture apparatus 10, with the set of positions c_dev taking the place of the determined positions of the virtual cameras.
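  • The rotation and subtraction steps described above can be sketched as follows. This is a minimal sketch assuming plain Python lists for vectors and matrices; the name `c_i` for the position of a real camera in the reference coordinate system and the function names `mat_vec` and `device_position` are assumptions made for illustration. The inlier and outlier step is not shown.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def device_position(c_i, r_dev, v_idev):
    """Estimate the apparatus position from one real camera.

    c_i: position of the real camera in the reference coordinate system.
    r_dev: rotation matrix of the apparatus in the reference coordinate system.
    v_idev: camera offset in the local coordinate system of the apparatus.
    """
    # Rotate the local offset into the reference coordinate system.
    v_widev = mat_vec(r_dev, v_idev)
    # Subtract the rotated offset from the camera position.
    return [c - v for c, v in zip(c_i, v_widev)]
```

For example, with an apparatus rotated 90 degrees about the z axis, a camera at (5, 6, 7) with local offset (1, 0, 0) gives an apparatus position of (5, 5, 7).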
  • a pixel to real world distance conversion factor may be determined. This may be performed by determining the distance between a pair of real cameras 11 on a multi-directional image capture apparatus 10 in both pixels and in a real world distance (e.g. metres). The pixel distance may be determined from the determined positions of the real cameras 11 in the reference coordinate system.
  • the real world distance may be known already from known physical parameters of the multi-directional image capture apparatus 10.
  • the pixel to real world distance conversion factor may then be simply calculated by taking the ratio of the two distances. This may be further refined by calculating the factor based on multiple different pairs of real cameras 11 of the multi-directional image capture apparatus 10, determining outliers and inliers (for example, in the same way as described above), and averaging the inliers to obtain a final pixel to real world distance conversion factor.
  • the pixel to real world distance conversion factor may be denoted S_pixel2meter in the present specification.
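  • The ratio computation described above can be sketched as follows. This sketch assumes the factor is expressed as reference-coordinate units ("pixels") per metre, which is one possible reading of "the ratio of the two distances", and it replaces the inlier and outlier step with a plain mean for brevity. The function names and the dictionary layout are hypothetical.

```python
import math
import statistics


def pixel_to_metre_factors(positions_px, distances_m):
    """Compute one conversion factor per camera pair.

    positions_px: dict mapping camera id -> (x, y, z) in the reference
        coordinate system (pixel units).
    distances_m: dict mapping (id_a, id_b) -> known physical distance in metres.
    Returns a list of factors in pixels per metre (assumed direction of ratio).
    """
    factors = []
    for (a, b), metres in distances_m.items():
        pixels = math.dist(positions_px[a], positions_px[b])
        factors.append(pixels / metres)
    return factors


def mean_factor(factors):
    """Average the per-pair factors (outlier rejection omitted in this sketch)."""
    return statistics.fmean(factors)
```

Using several camera pairs rather than one reduces the influence of a single badly estimated camera position on the final factor.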
  • an up-vector of each of the multi-directional image capture apparatuses 10 may also be determined based on the determined positions of the real cameras 11. As illustrated in Figure 4, this may be performed by determining two vectors V_1 and V_2 between the position of one of the real cameras 11 and the positions of two other real cameras 11. As such, the up-vector may be determined based on the positions of a group of three real cameras 11. The up-vector may be determined by determining the cross-product of V_1 and V_2 in accordance with the right hand rule. As illustrated in Figure 4, V_3 is the result of the cross product of V_1 and V_2 and represents the direction of the up-vector. V_3 may be normalised to obtain a unit vector representing the up-vector.
  • the up-vector of a multi-directional image capture apparatus 10 may be defined based on a group of real cameras 11 of the multi-directional image capture apparatus 10 which are, in normal use, intended to be in a plane that is perpendicular to gravity. As such, the up-vector may be another representation of the orientation of the multi-directional image capture apparatus 10. Further, if it is assumed that the multi-directional image capture apparatus 10 is placed in an orientation in which the plane of the cameras in the group is actually perpendicular to gravity, the up-vector may correspond to the real world up-vector (the vector opposite in direction to the local gravity vector). The up-vector may provide further information which can be used in 3D reconstruction of the captured environment.
  • the reference coordinate system discussed herein may not correspond exactly with the real world (for instance, the "up" direction in the reference coordinate system may not correspond with the "up" direction in the real world).
  • the calculated up-vector may allow a 3D reconstruction of the captured environment to be aligned with the real world (e.g. by ensuring that the up-vector is pointing in an up direction in the 3D reconstruction).
  • a set of up-vectors may be determined for each multi-directional image capture apparatus 10 based on determining V_1, V_2 and V_3 for a plurality of different groups of three cameras. Then outliers and inliers may be determined (in the same way as above, with the set of determined up-vectors taking the place of the determined positions of the virtual cameras) and a final up-vector may be determined based only on the inliers (e.g. by averaging the inliers).
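  • The cross-product construction of the up-vector can be sketched as follows. This is an illustrative sketch assuming camera positions are 3-tuples and that the cameras of each group are listed in an order that makes the right hand rule point "up"; the function names are hypothetical, and the inlier and outlier step is replaced by a plain normalised sum.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors (right hand rule)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalise(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector; cameras may be collinear")
    return tuple(c / n for c in v)


def up_vector(p0, p1, p2):
    """Unit up-vector from the positions of a group of three real cameras."""
    v1 = tuple(b - a for a, b in zip(p0, p1))  # V_1: from camera 0 to camera 1
    v2 = tuple(b - a for a, b in zip(p0, p2))  # V_2: from camera 0 to camera 2
    return normalise(cross(v1, v2))            # V_3, normalised


def mean_up_vector(groups):
    """Average the up-vectors of several camera groups (no outlier rejection)."""
    ups = [up_vector(*g) for g in groups]
    total = tuple(sum(u[k] for u in ups) for k in range(3))
    return normalise(total)
```

Swapping two cameras within a group flips the sign of the cross product, so a consistent ordering of the cameras matters.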
  • the up-vector may be rotated to align with a known local gravity vector (which is, for instance, determined using an accelerometer forming part of, or otherwise co-located with, the multi-directional image capture apparatus 10) to determine the real world up-vector in the reference coordinate system 30 (if it is not already aligned).
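  • One way to rotate an up-vector into alignment with a measured gravity vector is sketched below. The specification does not say how the rotation is computed; this sketch assumes Rodrigues' rotation formula and assumes that "aligned" means the up-vector points opposite to gravity. All function names are hypothetical.

```python
import math


def _normalise(v):
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return [c / n for c in v]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotation_between(a, b):
    """3x3 rotation matrix taking the direction of a onto the direction of b."""
    a, b = _normalise(a), _normalise(b)
    c = sum(x * y for x, y in zip(a, b))  # cosine of the angle between a and b
    if c < -1.0 + 1e-12:
        # Antiparallel vectors: rotate by 180 degrees about any axis
        # perpendicular to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = _normalise(_cross(a, helper))
        return [[2.0 * u[r] * u[col] - (1.0 if r == col else 0.0)
                 for col in range(3)] for r in range(3)]
    v = _cross(a, b)
    # Skew-symmetric matrix K of v; then R = I + K + K^2 / (1 + c).
    k = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    k2 = [[sum(k[r][m] * k[m][col] for m in range(3)) for col in range(3)]
          for r in range(3)]
    return [[(1.0 if r == col else 0.0) + k[r][col] + k2[r][col] / (1.0 + c)
             for col in range(3)] for r in range(3)]


def align_up_with_gravity(up, gravity):
    """Rotation that makes the up-vector point opposite to the gravity vector."""
    target = [-g for g in gravity]
    return rotation_between(up, target)


def rotate(m, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]
```

The same rotation could then be applied to the rest of a reconstruction so that its "up" direction agrees with the real world.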
  • the relative positions of the plurality of multi-directional image capture apparatuses may be determined according to: $c_{ij} = (c^{j}_{dev} - c^{i}_{dev}) / S_{pixel2meter}$
  • c_ij represents the position of one of the plurality of multi-directional image capture apparatuses (apparatus j) relative to another one of the plurality of multi-directional image capture apparatuses (apparatus i).
  • c^j_dev is the position of apparatus j and c^i_dev is the position of apparatus i.
  • S_pixel2meter is the pixel to real world distance conversion factor.
  • a vector representing the relative position of one of the plurality of multi-directional image capture apparatuses relative to another one of the plurality of multi-directional image capture apparatuses may be determined by taking the difference between their positions. This may be divided by the pixel-to-real world distance conversion factor depending on the scale desired.
  • the positions of all of the multi-directional image capture apparatuses 10 relative to one another may be determined in the reference coordinate system 30.
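  • The difference and optional scaling described above can be sketched as follows. This is a minimal sketch assuming positions are given as sequences of three numbers and that the conversion factor is expressed in reference-coordinate units per metre, so that dividing yields metres; the function name and its optional parameter are hypothetical.

```python
def relative_position(c_dev_i, c_dev_j, s_pixel2meter=None):
    """Position of apparatus j relative to apparatus i.

    c_dev_i, c_dev_j: apparatus positions in the reference coordinate system.
    s_pixel2meter: optional conversion factor; when given, the difference is
        divided by it to express the result at real world scale.
    """
    diff = [cj - ci for ci, cj in zip(c_dev_i, c_dev_j)]
    if s_pixel2meter is None:
        return diff  # result stays in reference coordinate (pixel) units
    return [d / s_pixel2meter for d in diff]
```

Leaving the factor out keeps the result in the scale of the reference coordinate system, which matches the statement that the division depends on the scale desired.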
  • FIG. 5 is a flowchart showing examples of operations as described herein.
  • a plurality of first images 21 which are captured by a plurality of multi-directional image capture apparatuses 10 may be received.
  • image data corresponding to the first images 21 may be received at computing apparatus 60 (see Figure 6).
  • image re-projection may be performed on each of the first images 21 to obtain one or more re-projected second images 22 corresponding to respective virtual cameras.
  • the second images 22 may be processed to obtain positions and orientations of the virtual cameras.
  • the second images 22 may be processed using a structure from motion algorithm.
  • positions and orientations of real cameras 11 may be determined based on the positions and orientations of the virtual cameras determined at operation 5.3.
  • a pixel-to-real world distance conversion factor may be determined based on the positions of the real cameras 11 determined at operation 5.4.
  • an up-vector of each multi-directional image capture apparatus 10 may be determined based on the positions of the real cameras 11 determined at operation 5.4.
  • positions and orientations of the plurality of multi-directional image capture apparatuses 10 may be determined based on the positions and orientations of the real cameras 11 determined at operation 5.4.
  • positions of the plurality of multi-directional image capture apparatuses 10 relative to each other may be determined based on the positions of the plurality of multi-directional image capture apparatuses 10 determined at operation 5.7.
  • the position of a real camera 11 as described herein may be the position of the centre of a lens of the real camera 11.
  • the position of a virtual camera may be the position of the centre of a virtual lens of the virtual camera.
  • the position of the multi-directional image capture apparatus 10 may be the centre of the multi-directional image capture apparatus (e.g. if a multi-directional image capture apparatus is spherically shaped, its position may be defined as the geometric centre of the sphere).
  • FIG. 6 is a schematic block diagram of an example configuration of computing apparatus 60, which may be configured to perform any of or any combination of the operations described herein.
  • the computing apparatus 60 may comprise memory 61, processing circuitry 62, an input 63, and an output 64.
  • the processing circuitry 62 may be of any suitable composition and may include one or more processors 62A of any suitable type or suitable combination of types.
  • the processing circuitry 62 may be a programmable processor that interprets computer program instructions and processes data.
  • the processing circuitry 62 may include plural programmable processors.
  • the processing circuitry 62 may be, for example, programmable hardware with embedded firmware.
  • the processing circuitry 62 may be termed processing means.
  • the processing circuitry 62 may alternatively or additionally include one or more Application Specific Integrated Circuits (ASICs). In some instances, processing circuitry 62 may be referred to as computing apparatus.
  • the processing circuitry 62 described with reference to Figure 6 is coupled to the memory 61 (or one or more storage devices) and is operable to read/write data to/from the memory.
  • the memory 61 may store thereon computer readable instructions 612A which, when executed by the processing circuitry 62, may cause any one of or any combination of the operations described herein to be performed.
  • the memory 61 may comprise a single memory unit or a plurality of memory units upon which the computer-readable instructions (or code) 612A is stored.
  • the memory 61 may comprise both volatile memory 611 and non-volatile memory 612.
  • the computer readable instructions 612A may be stored in the non-volatile memory 612 and may be executed by the processing circuitry 62 using the volatile memory 611 for temporary storage of data or data and instructions.
  • examples of volatile memory include RAM, DRAM, SDRAM, etc.
  • non-volatile memory examples include ROM, PROM, EEPROM, flash memory, optical storage, magnetic storage, etc.
  • the memories 61 in general may be referred to as non-transitory computer readable memory media.
  • the input 63 may be configured to receive image data representing the first images 21 described herein.
  • the image data may be received, for instance, from the multi-directional image capture apparatuses 10 themselves or may be received from a storage device.
  • the output 64 may be configured to output any of or any combination of the camera pose registration information described herein.
  • the camera pose registration information output by the computing apparatus 60 may be used for various functions as described above with reference to Figure 1.
  • Figure 7 illustrates an example of a computer-readable medium 70 with computer-readable instructions (code) stored thereon.
  • the computer-readable instructions (code) when executed by a processor, may cause any one of or any combination of the operations described above to be performed.
  • Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic.
  • the software, application logic and/or hardware may reside on memory, or any computer media.
  • the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
  • a "memory" or “computer-readable medium” may be any media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
  • references to, where relevant, "computer-readable storage medium", "computer program product", "tangibly embodied computer program" etc., or a "processor" or "processing circuitry" etc. should be understood to encompass not only computers having differing architectures such as single/multi-processor architectures and sequencers/parallel architectures, but also specialised circuits such as field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), signal processing devices and other devices.
  • References to computer program, instructions, code etc. should be understood to express software for a programmable processor, or firmware such as the programmable content of a hardware device, whether as instructions for a processor or as configuration settings for a fixed function device, gate array, programmable logic device, etc.
  • the term "circuitry" refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analogue and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a server, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a method comprising performing image re-projection on each of a plurality of first images, each first image being captured by a camera (11) of a respective one of a plurality of multi-directional image capture apparatuses (10), thereby generating a plurality of re-projected second images (22) which are each associated with a respective virtual camera; processing the plurality of second images to generate respective positions of the virtual cameras associated with the second images; and, based on the generated positions of the virtual cameras, determining a position of each of the plurality of multi-directional image capture apparatuses.
PCT/FI2017/050749 2016-11-30 2017-10-31 Methods and apparatuses for determining positions of multi-directional image capture apparatuses Ceased WO2018100230A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB1620312.7A GB2557212A (en) 2016-11-30 2016-11-30 Methods and apparatuses for determining positions of multi-directional image capture apparatuses
GB1620312.7 2016-11-30

Publications (1)

Publication Number Publication Date
WO2018100230A1 true WO2018100230A1 (fr) 2018-06-07

Family

ID=58073525

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2017/050749 Ceased WO2018100230A1 (fr) 2016-11-30 2017-10-31 Methods and apparatuses for determining positions of multi-directional image capture apparatuses

Country Status (2)

Country Link
GB (1) GB2557212A (fr)
WO (1) WO2018100230A1 (fr)

Also Published As

Publication number Publication date
GB201620312D0 (en) 2017-01-11
GB2557212A (en) 2018-06-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17875287

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17875287

Country of ref document: EP

Kind code of ref document: A1