Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application. It is apparent that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the scope of protection of the present application.
The terms "first", "second" and the like in the description and in the claims are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used may be interchanged where appropriate, so that the embodiments of the application may be practiced in orders other than those specifically illustrated and described herein. The objects distinguished by "first", "second", etc. are generally of one type, and the number of such objects is not limited; for example, there may be one first object or N first objects. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
Before introducing the technical solutions of the embodiments of the present application, the background of the embodiments of the present application is first described: in the related art, the region where an interfering object is located in a picture is filled with background pixels taken from a position symmetrical to that region within the same picture, which tends to produce artifacts and blurring when the background is complex.
In order to solve the above problems, the embodiments of the present application provide a photographing method, an apparatus and an electronic device. While a preview image of a photographing scene is displayed, a first image is obtained by imaging the preview image in response to a first input of a user, the first image including a first object and M second objects. A video segment of the photographing scene in a first time period after the first input is received is then acquired to obtain N frames of second images, and the pixel positions corresponding to the M second objects in the first image are filled with background pixels taken, from the N frames of second images, at the positions corresponding to the M second objects in the first image. Compared with the prior art, in which the filling uses background pixels at positions symmetrical to the first pixel set within the first image itself, the solution of the embodiments of the present application performs better when facing a complex background and can obtain images with a better effect.
The technical solution of the embodiments of the present application can be applied to a scene in which the background of the region where an interfering object is located in a picture is supplemented. For example, object A takes a picture at a tourist attraction, and interfering objects other than object A appear in the picture; after the other interfering objects are eliminated from the picture, object A needs to supplement the background of the regions where those interfering objects were located.
The photographing method provided by the embodiment of the application is described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a photographing method according to an embodiment of the present application, and the execution subject of the method may be an electronic device, where the electronic device may be, but is not limited to, a Personal Computer (PC), a smart phone, a tablet computer, a Personal Digital Assistant (PDA), or the like. It should be noted that the execution subject described above does not limit the embodiments of the present application.
As shown in fig. 1, the photographing method provided in the embodiment of the present application may include steps 110 to 150.
Step 110, receiving a first input of a user in the case of displaying a preview image.
The preview image may be an image of the photographing scene displayed in the electronic device before the user performs the photographing input when photographing the photographing scene using the electronic device, and may include at least one object therein. The shooting scene may be a preset shooting scene, for example, the shooting scene may be a scene of a tourist attraction.
The first input is an input performed on the electronic device, the first input is used for imaging the preview image to obtain a first image, and the first input may be a first operation. The first input may include, but is not limited to, a touch input of a user to the electronic device through a touch device such as a finger or a stylus, a voice command input by the user, a specific gesture input by the user, or other feasible inputs, which may be determined according to actual use requirements; the embodiment of the present application is not limited in this respect. The specific gesture in the embodiment of the application can be any one of a single-click gesture, a sliding gesture, a dragging gesture, a pressure recognition gesture, a long-press gesture, an area change gesture, a double-press gesture and a double-click gesture, and the click input in the embodiment of the application can be a single-click input, a double-click input, a click input of any number of times, or the like, and can also be a long-press input or a short-press input. For example, the first input may be a touch input of a user to a photographing control in the electronic device.
Step 120, in response to the first input, imaging the preview image to obtain a first image, where the first image includes a first object and M second objects.
The first image may be an image of the acquired captured scene, and the first image may include at least one object therein.
The first object may be an object to be retained in the first image, for example, may be a person, for example, may be a holder of the electronic device, or may be at least one object in the first image optionally selected by the user according to the requirement.
The second object may be an object that interferes with the first object.
With continued reference to the above, object A takes a photograph at a tourist attraction, and the photograph contains object B and object C in addition to object A. If object A wants to process the photograph so that only object A remains and objects B and C are removed from it, then object B and object C in the photograph correspond to second objects.
In some embodiments of the present application, the first image may be obtained by imaging a preview image of the captured scene in response to a first input by a user.
In some embodiments of the present application, to improve the accuracy of the determination of the first object and the second object, before step 110, the above-mentioned method may further comprise:
Receiving a second input of the user to the preview image;
Determining, in response to the second input, first location information for the second input;
After step 120, the above-mentioned method may further comprise:
determining second position information corresponding to at least one object according to the first skeleton coordinate set corresponding to the at least one object respectively;
And determining the first object and M second objects from the at least one object according to the distance between the first position information and the second position information corresponding to the at least one object respectively.
The second input is an input performed on the preview image, the second input is used for determining first position information of the second input, and the second input may be a second operation. The second input includes, but is not limited to, a touch input of the user to the preview image through a touch device such as a finger or a stylus, a voice command input by the user, a specific gesture input by the user, or other feasible inputs, which may be determined according to actual use requirements; the embodiment of the present application is not limited in this respect. The specific gesture in the embodiment of the application can be any one of a single-click gesture, a sliding gesture, a dragging gesture, a pressure recognition gesture, a long-press gesture, an area change gesture, a double-press gesture and a double-click gesture, and the click input in the embodiment of the application can be a single-click input, a double-click input, a click input of any number of times, or the like, and can also be a long-press input or a short-press input. For example, the second input may be a touch input of a user to an object in the preview image.
The first location information may be location information at which the second input is performed.
For any one of at least one object in the preview image, the first set of skeleton coordinates corresponding to the object may be a set of skeleton coordinates corresponding to the object.
For any one of at least one object in the preview image, the second location information corresponding to the object may be location information of the object in the preview image.
In some embodiments of the present application, when a user clicks a certain object in the preview image, the clicked position may be recorded and used as the first position information. After the user performs the first input to image the preview image and the first image is obtained, the second position information corresponding to each object in the first image may be determined according to the first skeleton coordinate set corresponding to that object, and the first object and the M second objects may then be determined from the at least one object according to the distances between the first position information and the second position information corresponding to the at least one object.
In some embodiments of the present application, when photographing, a user generally taps the object to be highlighted in the preview image in order to focus on it. Therefore, when the user clicks on a certain object in the preview image, the clicked position may be recorded and used as the first position information; the object at the first position information is usually the first object. However, the first object and the second objects determined in this way alone are not accurate enough, and they need to be determined according to the distances between the first position information and the second position information, as will be described in detail in subsequent embodiments.
In some embodiments of the present application, the first image may be input into a pre-trained skeleton extraction model, so as to obtain a first skeleton coordinate set corresponding to each object in the first image, which may specifically be extracting 33 skeleton coordinates corresponding to each object.
Referring to fig. 2, fig. 2 is a schematic diagram of 33 skeleton coordinates of a human body, and the meanings of the 33 skeleton coordinates in fig. 2 are as follows: "0" is the skeleton coordinate of the nose, "1" the inner skeleton coordinate of the left eye, "2" the center skeleton coordinate of the left eye, "3" the outer skeleton coordinate of the left eye, "4" the inner skeleton coordinate of the right eye, "5" the center skeleton coordinate of the right eye, "6" the outer skeleton coordinate of the right eye, "7" the left ear skeleton coordinate, "8" the right ear skeleton coordinate, "9" the left face skeleton coordinate, "10" the right face skeleton coordinate, "11" the left shoulder skeleton coordinate, "12" the right shoulder skeleton coordinate, "13" the left elbow skeleton coordinate, "14" the right elbow skeleton coordinate, "15" the left wrist skeleton coordinate, "16" the right wrist skeleton coordinate, "17" the left little finger skeleton coordinate, "18" the right little finger skeleton coordinate, "19" the left index finger skeleton coordinate, "20" the right index finger skeleton coordinate, "21" the left thumb skeleton coordinate, "22" the right thumb skeleton coordinate, "23" the left hip skeleton coordinate, "24" the right hip skeleton coordinate, "25" the left knee skeleton coordinate, "26" the right knee skeleton coordinate, "27" the left ankle skeleton coordinate, "28" the right ankle skeleton coordinate, "29" the left heel skeleton coordinate, "30" the right heel skeleton coordinate, "31" the left foot tip skeleton coordinate, and "32" the right foot tip skeleton coordinate.
In some embodiments of the present application, skeleton coordinates located at the barycentric position of the object in the first skeleton coordinate set corresponding to each object may be selected, and determined as second position information corresponding to each object, as shown in fig. 2, a midpoint of a connection line between the "23" skeleton coordinates and the "24" skeleton coordinates of the human body may be selected as the second position information of the object.
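For illustration, the following Python sketch computes the second position information of each object as the midpoint of its "23" (left hip) and "24" (right hip) coordinates, as described above. It assumes a pose model that outputs the 33 (x, y) landmarks of fig. 2 per detected person; the function and variable names are hypothetical.

```python
import numpy as np

LEFT_HIP, RIGHT_HIP = 23, 24  # landmark indices per fig. 2

def second_position(skeleton):
    """Return an object's second position information: the midpoint of the
    left-hip and right-hip skeleton coordinates (near the barycenter)."""
    skeleton = np.asarray(skeleton)          # shape (33, 2): (x, y) in pixels
    return (skeleton[LEFT_HIP] + skeleton[RIGHT_HIP]) / 2.0

# skeletons: list of (33, 2) arrays, one per object detected in the image,
# produced by a pre-trained skeleton extraction model (hypothetical here).
# positions = [second_position(s) for s in skeletons]
```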
In the embodiment of the application, the first position information of the second input is determined in response to the second input of the user to the preview image, and the second position information corresponding to the at least one object is then determined according to the first skeleton coordinate set corresponding to each of the at least one object. The first object and the M second objects can thus be determined from the at least one object according to the distances between the first position information and the second position information corresponding to the at least one object. In this way, the first object and the second objects are determined from the skeleton coordinate set of each object in the preview image, which improves the accuracy of their determination.
In some embodiments of the present application, in order to further improve accuracy of determining the first object and the second object, the determining, according to the distance between the first location information and the second location information corresponding to the at least one object, the first object and the M second objects from the at least one object may specifically include:
calculating the distance between the first position information and the second position information corresponding to at least one object respectively to obtain the first distance corresponding to at least one object respectively;
Determining an object corresponding to a minimum value in the first distances respectively corresponding to at least one object as a first object;
and determining other objects except the first object in the at least one object as M second objects.
The first distance corresponding to any one of the at least one object may be a distance between the first position information and the second position information corresponding to the object.
In some embodiments of the present application, a distance between the first position information and the second position information corresponding to each object in the first image may be calculated, so as to obtain a first distance corresponding to each object, then an object corresponding to a minimum value in the first distances corresponding to each object is determined to be the first object, and other objects in the first image except for the first object are determined to be the second object.
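A minimal sketch of this nearest-object selection, assuming the tap position and the per-object second position information are available as (x, y) pixel coordinates; the helper name is hypothetical.

```python
import numpy as np

def split_first_and_second(first_position, positions):
    """first_position: (x, y) of the user's tap (first position information).
    positions: per-object (x, y) second position information.
    Returns the index of the first object and the indices of the M second objects."""
    dists = [np.hypot(*(np.asarray(p) - first_position)) for p in positions]
    first_idx = int(np.argmin(dists))        # object closest to the tap
    second_idx = [k for k in range(len(positions)) if k != first_idx]
    return first_idx, second_idx
```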
It should be noted that, before the user performs the first input, the preview image may be clicked multiple times, that is, the user may select the first object multiple times. Therefore, each time the user performs a second input before the first input, the first object and the second objects are determined in the manner described above. That is, after the user performs a second input, the position information of the second input is obtained, the frame of preview image is input into the pre-trained skeleton extraction model to obtain the skeleton coordinate sets of all objects in the preview image, the position information of each object is determined according to its skeleton coordinate set, the distances between the position information of the second input and the position information of each object in the preview image are calculated, the object corresponding to the minimum distance is taken as the first object, and the other objects in the preview image are taken as the second objects. This continues until the user performs the first input.
In one example, take M+1 objects in the preview image, each with 33 skeleton coordinates. After the user clicks a certain object in the preview image, the clicked position information K is recorded, and the frame of preview image is input into the pre-trained skeleton extraction model to obtain the skeleton coordinate sets U1(pt1, pt2, ..., pt33), U2(pt1, pt2, ..., pt33), ..., UM+1(pt1, pt2, ..., pt33) corresponding to the M+1 objects in the preview image. The position information K1, K2, ..., KM+1 of each object is then determined from its skeleton coordinate set, the distances d1, d2, ..., dM+1 between the position information K and the position information K1, K2, ..., KM+1 are calculated, and the object corresponding to the minimum distance is selected as the first object, the remaining objects being the second objects.
If the user has not executed the first input, the next frame of the preview image continues to be fed into the pre-trained skeleton extraction model to obtain updated skeleton coordinate sets U1(pt1, pt2, ..., pt33), U2(pt1, pt2, ..., pt33), ..., UM+1(pt1, pt2, ..., pt33) for the M+1 objects in the preview image. The above steps are executed repeatedly, continuously updating the skeleton coordinate set of each object in the preview image; this process loops until the user executes the first input.
After the user executes the first input, the first image is likewise input into the pre-trained skeleton extraction model to obtain the skeleton coordinate sets of all objects in the first image. The position information of the last click on the preview image before the first input was executed is then acquired and used as the first position information, and the first object and the second objects in the first image are determined in the manner described above.
In the embodiment of the application, the distance between the first position information and the second position information corresponding to at least one object is calculated to obtain the first distance corresponding to at least one object, then the object corresponding to the minimum value in the first distance corresponding to at least one object is determined as the first object, and other objects except the first object in at least one object are determined as M second objects, so that the first object and the second object in the first image can be accurately determined, and the accuracy of determining the first object and the second object is further improved.
Step 130, acquiring a video segment in a first time period after the first input is received, to obtain N frames of second images.
The first period of time may be a preset period of time, for example, may be 1 minute, and specifically, the first period of time may be set according to a user requirement, which is not limited in the embodiment of the present application.
M and N may both be positive integers.
The second image may be an image of a photographed scene photographed within a first period of time since the first input is received.
In some embodiments of the present application, after the first input is performed to obtain the first image, the time for obtaining the first image may be recorded, then the photographing device is kept still, photographing is continued, a video stream of the photographed scene in the first period of time is obtained, and then a plurality of video frames may be intercepted from the video stream to obtain N frames of the second image.
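For illustration, a sketch of acquiring the N frames of second images over the first time period with OpenCV; the camera index, the 1-minute period and the frame count are example values, not prescribed by the embodiment.

```python
import time
import cv2

def capture_second_images(camera_index=0, period_s=60.0, n_frames=10):
    """Grab N roughly evenly spaced frames of the scene during the first
    time period after the first input (values here are illustrative)."""
    cap = cv2.VideoCapture(camera_index)
    frames, t0 = [], time.time()
    step = period_s / n_frames
    while time.time() - t0 < period_s and len(frames) < n_frames:
        ok, frame = cap.read()
        if ok and time.time() - t0 >= len(frames) * step:
            frames.append(frame)             # intercepted video frame
    cap.release()
    return frames
```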
Step 140, extracting a first pixel position set of M second objects in the first image and background pixels in positions corresponding to the M second objects in the first image in the N frames of second images.
The first set of pixel positions may be a set of pixel positions corresponding to M second objects in the first image.
In some embodiments of the present application, after the first image and the N frames of second images are acquired, a first set of pixel positions of M second objects in the first image and background pixels in the N frames of second images at positions corresponding to the M second objects in the first image may be extracted.
In some embodiments of the present application, in order to accurately obtain the first set of pixel positions, the extracting the first set of pixel positions of the M second objects in the first image may specifically include:
Converting the first image into a gray scale image to obtain a first gray scale image;
dividing the foreground of the first gray level image to obtain a first foreground image;
And determining a first pixel position set of the M second objects in the first image based on the first skeleton coordinate sets respectively corresponding to the first foreground image and the M second objects.
The first gray-scale image may be an image obtained by converting the first image into a gray-scale image.
The first foreground image may be an image obtained by extracting a foreground of the first grayscale image.
In some embodiments of the present application, the first image is converted into a grayscale image to obtain the first grayscale image, and the foreground in the first grayscale image is then extracted to obtain the first foreground image. Specifically, a K-Nearest Neighbor (KNN) algorithm may be used to extract the foreground of the first grayscale image to obtain the first foreground image. The first pixel position set of the M second objects in the first image can then be obtained according to the first foreground image and the first skeleton coordinate sets respectively corresponding to the M second objects.
In the embodiment of the application, the first image is converted into the gray level image to obtain the first gray level image, the foreground in the first gray level image is extracted to obtain the first foreground image, and then the first pixel position set of M second objects in the first image can be accurately obtained according to the first skeleton coordinate set respectively corresponding to the first foreground image and the M second objects.
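A sketch of the grayscale conversion and KNN foreground extraction using OpenCV's createBackgroundSubtractorKNN. The embodiment does not specify how the subtractor's background model is initialized; feeding it the N second images first is an assumption made here so that it has a frame history to learn from.

```python
import cv2

# mat_k: the first image; second_images: the N frames of second images.
gray_k = cv2.cvtColor(mat_k, cv2.COLOR_BGR2GRAY)      # first grayscale image

subtractor = cv2.createBackgroundSubtractorKNN(detectShadows=False)
for frame in second_images:                           # learn the background
    subtractor.apply(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))

# First foreground image: 255 where a foreground object is detected, else 0.
first_foreground = subtractor.apply(gray_k)
```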
In some embodiments of the present application, determining a first set of pixel positions of M second objects in the first image based on a first set of skeleton coordinates corresponding to the first foreground image and the M second objects, respectively, may specifically include:
Determining foreground areas corresponding to the M second objects based on first skeleton coordinate sets respectively corresponding to the first foreground image and the M second objects;
And adding masks to pixels of other foreground areas except the foreground areas corresponding to the M second objects in the first foreground image to obtain a first pixel position set of the M second objects in the first image.
In some embodiments of the present application, a first foreground image and a first skeleton coordinate set corresponding to M second objects may be compared, a foreground region corresponding to M second objects may be determined, and then pixels of other foreground regions in the first foreground image except for the foreground region corresponding to M second objects may be added with a mask, so as to obtain a first pixel position set of M second objects in the first image.
With continued reference to the above example, the first image mat_k is converted into a grayscale image to obtain the first grayscale image gray_k, and the foreground regions in gray_k are extracted to obtain the first foreground image. The first foreground image is then compared with the first skeleton coordinate sets respectively corresponding to the M second objects, so that the foreground regions corresponding to the M second objects can be determined. A mask is added to the pixels of the other foreground regions in the first foreground image, specifically by updating their pixel values to 0, so that the first pixel position set 31 of the M second objects in the first image, framed as shown in fig. 3, is obtained. As can be seen from fig. 3, the pixel values of the pixels in the framed first pixel position set 31 are 255 and the pixel values of the other pixels in the first foreground image are 0, so the first pixel position set 31 can be extracted directly.
In the embodiment of the application, the foreground regions corresponding to the M second objects can be determined by comparing the first skeleton coordinate sets corresponding to the first foreground image and the M second objects respectively, and then the mask is added to the pixels of other foreground regions except the foreground regions corresponding to the M second objects in the first foreground image, so that the first pixel position set of the M second objects in the first image can be accurately obtained, and the determination accuracy of the first pixel position set is further improved.
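A sketch of the masking step, assuming each second object's foreground region has been reduced to a bounding box derived from its skeleton coordinate set; the embodiment only says the foreground image and the skeleton sets are compared, so the bounding-box form is an assumption.

```python
import numpy as np

def first_pixel_position_set(fg_mask, second_obj_boxes):
    """fg_mask: binary first foreground image (uint8, values 0/255).
    second_obj_boxes: per-second-object (x0, y0, x1, y1) regions derived
    from the skeleton coordinate sets (an assumption for this sketch).
    Zeroes every foreground pixel outside those regions, so the surviving
    255-valued pixels form the first pixel position set (cf. fig. 3)."""
    keep = np.zeros_like(fg_mask)
    for x0, y0, x1, y1 in second_obj_boxes:
        keep[y0:y1, x0:x1] = 255
    return np.where(keep == 255, fg_mask, 0).astype(np.uint8)
```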
In some embodiments of the present application, in order to accurately extract background pixels in N frames of second images at positions corresponding to M second objects in the first image, the extracting background pixels in N frames of second images at positions corresponding to M second objects in the first image may specifically include:
Matching an ith frame of second images in the N frames of second images with the first image to obtain a second pixel position set matched with the first pixel position set in the ith frame of second images;
And determining the background pixels at the second pixel position set in the second image of the ith frame as the background pixels at the positions corresponding to the M second objects in the first image in the second image of the ith frame.
The i-th frame second image can be any one of N frames of second images, i can be a positive integer, and i is more than or equal to 1 and less than or equal to N.
The second set of pixel locations may be a set of pixel locations in the second image of the i-th frame that match the first set of pixel locations.
In some embodiments of the present application, an i-th frame second image in the N-th frame second image may be matched with the first image, a second pixel position set in the i-th frame second image that is matched with the first pixel position set is obtained, and then a background pixel at the second pixel position set is determined as a background pixel at a position corresponding to M second objects in the first image in the i-th frame second image.
With continued reference to the above example, object A takes a photograph Q at a tourist attraction, in which there are object B and object C in addition to object A; object A is the first object, and object B and object C are the second objects. After photograph Q is taken, the photographing device is kept still and a 1-minute video segment is captured, from which 10 frames of second images are taken. During the capture of these 10 frames, object B leaves the shooting scene and an object D enters it, so that one or more of the 10 frames of second images contain object C and object D in addition to object A; for example, the 1st frame of the 10 frames of second images may contain object C and object D.
Referring to fig. 4 (a) and fig. 4 (b), fig. 4 (a) is a schematic diagram of the first image, which contains object A, object B and object C; the first pixel position set is the region containing object B and object C, that is, the region 41 in fig. 4 (a). Fig. 4 (b) is a schematic diagram of the i-th frame of second image among the N frames of second images, taking the 1st frame of the 10 frames of second images as an example, which contains object C and object D in addition to object A. In the 1st frame of second image, the region matching the region 41, that is, the region 42 in fig. 4 (b), is found, and the background pixels of the region 42 are the background pixels in the 1st frame of second image at the positions corresponding to the region 41 in the first image.
It should be noted that, for each of the N frames of second images, the background pixels at the positions corresponding to the M second objects in the first image in each of the N frames of second images may be determined according to the above-mentioned determination method of the background pixels at the positions corresponding to the M second objects in the first image in the i-th frame of second image.
In the embodiment of the application, the i-th frame of second image among the N frames of second images is matched with the first image to obtain the second pixel position set, in the i-th frame of second image, that matches the first pixel position set. The background pixels at the second pixel position set can then be directly determined as the background pixels, in the i-th frame of second image, at the positions corresponding to the M second objects in the first image, which improves the accuracy with which those background pixels are determined.
Step 150, performing pixel filling on the first pixel position set with the background pixels, in the N frames of second images, at the positions corresponding to the M second objects in the first image, to obtain a target image.
The target image may be an image obtained by filling the first pixel position set with background pixels at positions corresponding to M second objects in the first image in the N frames of second images.
In some embodiments of the present application, in order to mitigate the artifact and blurring problems of pictures obtained by the prior-art solution, step 150 may specifically include:
extracting a third pixel position set of P third objects in the second image of the ith frame;
calculating an intersection of the first pixel position set and the third pixel position set to obtain a fourth pixel position set;
and filling background pixels of a first target pixel position set in the third pixel position set into a second target pixel position set in the first pixel position set to obtain a target image.
Wherein the third objects may be the objects other than the first object in the i-th frame of second image. There may be P third objects, P being a positive integer.
The P third objects and the M second objects may not be identical, because some objects may leave the shooting scene and other objects may enter it during shooting.
The third set of pixel positions may be a set of pixel positions corresponding to P third objects in the i-th frame image.
The fourth pixel position set may be the set of pixel positions, within the third pixel position set, corresponding to the objects in the intersection of the P third objects and the M second objects.
In some embodiments of the present application, an intersection object between P third objects and M second objects may be calculated first, and then a pixel location set corresponding to the intersection object in the third pixel location set is taken as a fourth pixel location set.
With continued reference to the above example, referring to fig. 5, the third pixel position set corresponding to object C and object D in the 1st frame of second image is the region 51. The intersection between the region 41 in fig. 4 (a) and the region 51 in fig. 5 is then calculated; this intersection is the set of pixel positions corresponding to object C, that is, the region 52 in fig. 5.
The first set of target pixel locations may be a set of pixel locations in the third set of pixel locations other than the fourth set of pixel locations.
With continued reference to fig. 5, the set of pixel positions in the region 51 except the region 52 is determined as the first target set of pixel positions, that is, the set of pixel positions corresponding to the object D in fig. 5 is the first target set of pixel positions, that is, the region 53 in fig. 5 is the first target set of pixel positions.
The second set of target pixel locations may be a set of pixel locations in the first set of pixel locations that match the first set of target pixel locations.
With continued reference to fig. 4 (a) and fig. 5, the position of the object B in fig. 4 (a) corresponds to the position of the object D in fig. 5, so the position of the object B in fig. 4 (a) is the second set of target pixel positions, i.e. the region 43 in fig. 4 (a).
In some embodiments of the present application, the target image may be obtained by extracting a third set of pixel positions of P third objects in the second image of the i-th frame, calculating an intersection of the first set of pixel positions and the third set of pixel positions to obtain a fourth set of pixel positions, and then filling a background pixel of the first set of target pixel positions in the third set of pixel positions into the second set of target pixel positions in the first set of pixel positions.
With continued reference to fig. 4 (a) and 5, the background pixels at region 53 in fig. 5 may be filled into region 43 in fig. 4 (a), such that the background of the region of fig. 4 (a) where object B is located may be filled.
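The set operations of this example can be sketched with boolean masks, following the set algebra made explicit in step 10 of fig. 7 (fill Zc - (Zc ∩ Zm), then update Zc = Zc ∩ Zm); zc and zm are assumed to be masks already aligned in the first image's coordinate system.

```python
import numpy as np

# zc: boolean mask of the first pixel position set in the first image.
# zm: boolean mask of the third pixel position set in the aligned i-th frame.
# mat_k: first image (H, W, 3); frame_i: i-th second image warped into the
# first image's coordinate system.
fourth = zc & zm                  # intersection: still occluded in this frame
fillable = zc & ~zm               # second-object positions now showing background
mat_k[fillable] = frame_i[fillable]   # pixel filling from the background
zc = fourth                       # update Zc = Zc ∩ Zm (step 10 of fig. 7)
```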
In the embodiment of the application, the third pixel position set of the P third objects in the i-th frame of second image is extracted, the intersection of the first pixel position set and the third pixel position set is calculated to obtain the fourth pixel position set, and the background pixels of the first target pixel position set in the third pixel position set are filled into the second target pixel position set in the first pixel position set to obtain the target image. Compared with the prior art, in which the filling uses background pixels at positions symmetrical to the first pixel position set within the first image, the embodiment of the present application mitigates the artifact and blurring problems of pictures obtained by the prior-art scheme and obtains an image with a better effect.
In some embodiments of the present application, after the second target pixel position set in the first image is filled from the i-th frame of second image, the obtained image may still contain other second objects whose background has not yet been filled, so it is not yet the final image and needs to be processed further. Specifically, this can be done as follows:
In some embodiments of the present application, in order to further improve the artifacts and blurring problem of the picture obtained by the prior art solution, the filling the background pixels of the first target pixel position set in the third pixel position set into the second target pixel position set in the first pixel position set to obtain the target image may specifically include:
Filling background pixels of a first target pixel position set in the third pixel position set into a second target pixel position set in the first pixel position set to obtain a candidate target image;
updating the first pixel position set to be a pixel position set except for the second target pixel position set in the first pixel position set;
and in the case that a second image among the N frames of second images has not been matched with the first image and the updated first pixel position set is not empty, updating the j-th second image to be the i-th second image, updating the candidate target image to be the first image, and returning to the step of performing pixel filling on the first pixel position set with the N frames of second images to obtain the target image, until all the N frames of second images have been matched with the first image or the updated first pixel position set is empty.
The candidate target image may be an image obtained by filling a background pixel of a first target pixel position set in the third pixel position set into a second target pixel position set in the first pixel position set.
The j-th second image may be any one of the N frames of second images, where j is a positive integer, 0 ≤ j ≤ N, and j ≠ i; that is, the j-th second image may be any one of the N frames of second images other than the i-th second image.
In some embodiments of the present application, after the background pixels of the first target pixel position set in the third pixel position set are filled into the second target pixel position set in the first pixel position set to obtain the candidate target image, the first pixel position set may be updated to the set of pixel positions in the first pixel position set other than the second target pixel position set. Then, when it is determined that a second image among the N frames of second images has not been matched with the first image and the updated first pixel position set is not empty, that is, some second images have not yet been matched with the first image and the region where the second objects are located in the first image has not been completely filled with background, the j-th second image is updated to be the i-th second image, the candidate target image is updated to be the first image, and the step of performing pixel filling on the first pixel position set with the N frames of second images to obtain the target image is executed again, until all the N frames of second images have been matched with the first image or the updated first pixel position set is empty.
With continued reference to the above example, after the region 43 in fig. 4 (a) is background-filled with the background pixels of the region 53 in fig. 5, the region 41 in fig. 4 (a) is updated to the set of other pixel positions in the region 41 except for the region 43, that is, the set of pixel positions corresponding to the object C in fig. 4 (a).
At this point, only the region where object C is located in the first image remains unfilled, and the 2nd to 10th frames of second images have not yet been used to fill it. The 2nd frame of second image can therefore be acquired, and the region where object C is located in the first image is background-filled in the same manner as the region where object B was located was filled from the 1st frame of second image. It is then judged whether any second image among the 10 frames has not been used for matching with the first image, and whether the region where object C is located has been completely filled. If some second images have not yet been used and the region has not been completely filled, the 3rd frame of second image is acquired and the first pixel position set in the first image continues to be background-filled in the same manner, and so on, until all 10 frames of second images have been used for matching with the first image or the first pixel position set in the first image has been completely filled.
In the embodiment of the application, the candidate target image is obtained by filling the background pixels of the first target pixel position set in the third pixel position set into the second target pixel position set in the first pixel position set, and the first pixel position set is then updated to the fourth pixel position set. In the case that a second image among the N frames of second images has not been matched with the first image and the updated first pixel position set is not empty, the j-th second image is updated to be the i-th second image, the candidate target image is updated to be the first image, and the step of performing pixel filling on the first pixel position set with the N frames of second images to obtain the target image is executed again, until all the N frames of second images have been matched with the first image or the updated first pixel position set is empty. In this way, the region where the second objects are located is filled progressively from multiple frames, which further mitigates the artifact and blurring problems of the prior-art solution.
In some embodiments of the present application, in order to obtain a target image meeting the user requirement, after said updating the first set of pixel positions to the fourth set of pixel positions, the above-mentioned method may further comprise:
Under the condition that all the N frames of second images have been matched with the first image and the updated first pixel position set is not empty, determining a fifth pixel position set matched with the updated first pixel position set in the N frames of second images;
And filling background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N frames of second images into the updated first pixel position set to obtain a target image.
The fifth pixel position set is a set of pixel positions matched with the updated first pixel position set in the N frames of the second image.
The sixth set of pixel locations is a set of pixel locations in the N frames of the second image that are symmetrical to the fifth set of pixel locations.
In some embodiments of the present application, in the case where it is determined that all the N frames of second images have been matched with the first image and the updated first pixel position set is not empty, that is, all the N frames of second images have been used for matching with the first image to background-fill the first pixel position set in the first image, yet the region where the second objects are located in the first image has not been completely filled, a fifth pixel position set matching the updated first pixel position set in the N frames of second images may be determined, and the background pixels of a sixth pixel position set, symmetrical to the fifth pixel position set in the N frames of second images, are then filled into the updated first pixel position set to obtain the target image.
When determining the fifth pixel position set matched with the updated first pixel position set in the N frames of second images, any one frame or any several frames of second images in the N frames of second images may be selected, and the fifth pixel position set matched with the updated first pixel position set may be determined from any one frame or any several frames of second images selected.
When any one frame or any several frames of second images are selected from the N frames of second images, the second image with complete background in the sixth pixel position set may be selected, so that when the updated first pixel position set is filled with the background pixels in the sixth pixel position set of the frame of second images, the updated first pixel position set may be well filled.
With continued reference to the above example, take a certain frame of second image selected from the N frames of second images, for example the 1st frame of second image in fig. 5. After the region 41 in fig. 4 (a) is updated to the set of pixel positions corresponding to object C, referring to fig. 6, fig. 6 is a schematic diagram of a candidate target image in which only the region 61 where object C is located remains unfilled. In fig. 5, the region matching the region 61 in fig. 6 is the region 52; that is, the region 52 is the fifth pixel position set.
With continued reference to fig. 5, in fig. 5, a region 54 symmetrical to the region 52 is taken as the sixth set of pixel locations.
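A sketch of this symmetric fallback fill. The embodiment does not pin down the symmetry axis, so mirroring about the image's vertical centerline is an assumption made here purely for illustration.

```python
import numpy as np

def fill_by_symmetry(mat_k, frame, zc):
    """frame: a second image chosen because its background is complete in the
    mirrored (sixth) region. Copies each still-unfilled pixel of the updated
    first pixel position set zc from its horizontally mirrored position.
    Mirroring about the vertical centerline is an assumption."""
    _, w = zc.shape
    ys, xs = np.nonzero(zc)                  # updated first pixel position set
    mat_k[ys, xs] = frame[ys, w - 1 - xs]    # sixth set: mirrored columns
    return mat_k
```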
In the embodiment of the application, under the condition that N frames of second images are matched with the first image and the updated first pixel position set is not empty, a fifth pixel position set matched with the updated first pixel position set in the N frames of second images can be determined, then background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N frames of second images are filled into the updated first pixel position set to obtain a target image, and therefore all the first pixel position sets corresponding to M second objects in the first image can be filled, and the obtained target image is the image with the completed background filling, so that the requirements of users are met.
In some embodiments of the present application, in order to improve the accuracy of background-filling the first pixel position set in the first image, before the matching of the i-th frame of second image among the N frames of second images with the first image to obtain the second pixel position set, in the i-th frame of second image, that matches the first pixel position set, the method may further include:
converting the second image of the ith frame into a gray scale image to obtain a second gray scale image;
matching the second gray level image with the first gray level image to obtain a transformation matrix between the second gray level image and the first gray level image;
converting the second gray level image into a coordinate system corresponding to the first gray level image based on the transformation matrix to obtain a third gray level image;
The matching of the i-th frame of second image among the N frames of second images with the first image to obtain the second pixel position set, in the i-th frame of second image, that matches the first pixel position set may specifically include:
matching the third grayscale image with the first image to obtain the second pixel position set, in the third grayscale image, that matches the first pixel position set.
The second gray scale image may be a gray scale image obtained by converting the i-th frame second image into a gray scale image.
The third grayscale image may be a grayscale image obtained by converting the second grayscale image into a coordinate system corresponding to the first grayscale image based on a transformation matrix between the first grayscale image and the second grayscale image.
In some embodiments of the present application, after performing the first input, the user needs to keep the photographing device still for the first time period. However, a handheld photographing device cannot be guaranteed to be absolutely still, and uncertain factors such as jitter may exist, so the picture range, coordinate system, and the like of the N frames of second images in the video segment may be inconsistent with those of the first image. The i-th frame of second image therefore needs to be converted into a grayscale image to obtain the second grayscale image, the second grayscale image is matched with the first grayscale image to obtain a transformation matrix between them, and the second grayscale image is converted into the coordinate system corresponding to the first grayscale image based on the transformation matrix to obtain the third grayscale image. The third grayscale image can then be matched with the first image to obtain the second pixel position set, in the third grayscale image, that matches the first pixel position set.
In the embodiment of the application, the second grayscale image is obtained by converting the i-th frame of second image into a grayscale image, the second grayscale image is matched with the first grayscale image to obtain the transformation matrix between them, and the second grayscale image is then converted into the coordinate system corresponding to the first grayscale image based on the transformation matrix to obtain the third grayscale image. The subsequent matching is performed between the third grayscale image and the first image, which ensures that the picture range and coordinate system of the N frames of second images are consistent with those of the first image, and further improves the accuracy of background-filling the first pixel position set in the first image.
In some embodiments of the present application, in order to accurately obtain a transformation matrix between the second gray scale image and the first gray scale image, the matching the second gray scale image and the first gray scale image to obtain a transformation matrix between the second gray scale image and the first gray scale image may specifically include:
matching the features in the second gray level image with the features in the first gray level image to obtain Q matching point pairs;
and determining a transformation matrix between the second gray level image and the first gray level image according to the Q matching point pairs.
The matching point pair may be a feature pair in which the feature in the second gray scale image and the feature in the first gray scale image match each other.
In some embodiments of the present application, a feature detection matching algorithm may be used to detect and match features in the first gray image and features in the second gray image to obtain Q matching point pairs, and then, according to the Q matching point pairs, a transformation matrix between the second gray image and the first gray image may be obtained.
In some embodiments of the present application, the feature detection and matching algorithm described above may include, but is not limited to, a Scale-Invariant Feature Transform (SIFT) algorithm, an Oriented FAST and Rotated BRIEF (ORB) algorithm, or a Speeded-Up Robust Features (SURF) algorithm, among others.
In some embodiments of the present application, taking the SIFT algorithm as an example, a plurality of feature points are found in the first grayscale image and the second grayscale image by the SIFT algorithm to obtain a descriptor for each feature point in the two images. The distances between the descriptors of the feature points in the first grayscale image and those in the second grayscale image are then calculated, and any two feature points whose descriptor distance is smaller than a certain threshold are determined as a matching point pair, thereby obtaining the Q matching point pairs. The Q matching point pairs are then processed, for example with the cv2.getPerspectiveTransform() function, to obtain the transformation matrix between the second grayscale image and the first grayscale image.
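A sketch of the above with OpenCV: SIFT features are detected and ratio-matched to obtain the Q matching point pairs, and the transformation matrix is estimated from them. cv2.findHomography is used here in place of cv2.getPerspectiveTransform, which accepts exactly four point pairs, so that all Q (possibly noisy) pairs can be used with RANSAC.

```python
import cv2
import numpy as np

def align_to_first(gray_k, gray_video):
    """Warp the i-th frame's grayscale image (gray_video) into the first
    grayscale image's (gray_k) coordinate system via SIFT matching."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(gray_k, None)
    kp2, des2 = sift.detectAndCompute(gray_video, None)
    matches = cv2.BFMatcher().knnMatch(des2, des1, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # transformation matrix
    h, w = gray_k.shape
    return cv2.warpPerspective(gray_video, H, (w, h))     # third grayscale image
```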
In the embodiment of the application, the Q matching point pairs are obtained by matching the features in the second gray level image with the features in the first gray level image, and the transformation matrix between the second gray level image and the first gray level image is determined according to the Q matching point pairs, so that the transformation matrix between the second gray level image and the first gray level image can be accurately obtained.
In order to more clearly understand the scheme of the embodiment of the present application, a photographing method provided by the embodiment of the present application is described below in a specific scenario, and fig. 7 is a schematic flow chart of a photographing method provided by the embodiment of the present application, and as shown in fig. 7, the photographing method provided by the embodiment of the present application may include steps 1 to 14.
Step 1, displaying a preview image of the shooting scene.
Step 2, in response to a second input of the user to the preview image, recording the skeleton coordinate set of the first object.
In step 2, the skeleton coordinate set of the first object may be determined in the manner in the foregoing embodiment, which is not described herein.
Step 3, judging whether the first input is executed; if yes, executing step 4, and if not, returning to step 2.
In step 3, it is determined whether the user has performed the first input. If yes, step 4 is performed; if not, step 2 is performed again: the preview image of the shooting scene is redisplayed, the first object selected by the user is tracked, and the skeleton coordinate set of the first object is updated.
Step 4, storing the first image and the first skeleton coordinate sets of the first object and the M second objects, and recording the time of the first input.
In step 4, after the user performs the first input, a first image may be obtained, the first image is saved, a skeleton coordinate set of the first object and skeleton coordinate sets of M second objects in the first image are saved, and then the time of the first input is recorded.
Step 5, taking the first grayscale image of the first image, and calculating the first pixel position set of the M second objects in the first grayscale image.
In step 5, a first gray image gray_k of the first image mat_k is acquired, and then a first set Zc of pixel positions where M second objects are located in the first gray image gray_k is calculated.
Step 6, acquiring the i-th frame of second image, and calculating the second grayscale image of the i-th frame of second image.
In step 6, according to the time of the first input, video segments in a first time period after the first input can be acquired to obtain N frames of second images, then the i-th frame of second images mat_video is acquired from the N frames of second images, and then the second gray-scale image gray_video of the i-th frame of second images is obtained.
Step 7, calculating the transformation matrix between the first grayscale image and the second grayscale image.
In step 7, the calculation of the transformation matrix between the first gray scale image and the second gray scale image may be performed in the manner of the calculation of the transformation matrix in the above embodiment, which is not described herein.
Step 8, converting the second grayscale image into the coordinate system of the first grayscale image to obtain the third grayscale image.
In step 8, the mat_video may be subjected to perspective transformation to obtain a transformed map mat_video_transform, and then subjected to gray level transformation to obtain a corresponding third gray level map gray_video_transform.
Step 9, performing foreground segmentation on the third grayscale image to obtain the third pixel position set of the P third objects in the third grayscale image.
In step 9, foreground segmentation is performed on the third gray-scale image, so as to obtain a third pixel position set Zm of the P third objects in the third gray-scale image.
Step 10, filling the background pixels of the first target pixel position set in the third grayscale image into the second target pixel position set in the first pixel position set, and updating the first pixel position set.
In step 10, the background pixels at Zc - (Zc ∩ Zm) in gray_video_transform, i.e. the background pixels in the first target pixel position set, may be filled into mat_k, specifically into the second target pixel position set in mat_k, and the first pixel position set in the first image may be updated as Zc = Zc ∩ Zm.
Step 11, judging whether all the N frames of second images have been used and whether the first pixel position set is empty; if not, returning to step 6, and if so, executing step 12.
In step 11, if the N frames of second images have not all been used and the first pixel position set is not empty, step 6 is returned to so as to acquire the j-th frame of second image, and steps 6 to 11 are executed again, until all the N frames of second images have been used or the first pixel position set is empty.
Step 12, determining whether the first pixel position set is empty, if yes, executing step 13, and if not, executing step 14.
Step 13, returning the target image.
Step 14, performing symmetry and sequence analysis on the remaining area and filling it.
In step 14, symmetry and sequence analysis and filling are performed on the remaining area. That is, as in the above embodiment, when it is determined that the N frames of second images have all been matched with the first image and the updated first pixel position set is not empty, a fifth pixel position set matched with the updated first pixel position set is determined in the N frames of second images, and background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N frames of second images are filled into the updated first pixel position set, thereby obtaining the target image.
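As a rough sketch only: the horizontal symmetry axis and the choice of source frame below are assumptions for illustration; the embodiment determines the symmetric sixth pixel position set within the matched second images themselves.

```python
# Fall-back fill for positions no frame could supply directly.
if Zc and second_images:
    h, w = mat_k.shape[:2]
    for (r, c) in Zc:
        mat_k[r, c] = second_images[0][r, w - 1 - c]  # horizontally mirrored pixel
    Zc = set()
target_image = mat_k
```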
The execution subject of the photographing method provided by the embodiment of the present application may be a photographing apparatus. In the embodiment of the present application, the photographing apparatus provided by the embodiment of the present application is described by taking the photographing apparatus executing the photographing method as an example.
Fig. 8 is a schematic structural diagram of a photographing apparatus according to an embodiment of the present application. As shown in Fig. 8, the photographing apparatus 800 may include:
a first receiving module 810, configured to receive a first input of a user in a case of displaying a preview image;
a first determining module 820, configured to respond to the first input and image the preview image to obtain a first image, where the first image includes a first object and M second objects;
an acquisition module 830, configured to acquire N frames of second images from a video clip in a first period after the first input is received, where M and N are positive integers;
an extraction module 840, configured to extract a first pixel position set of the M second objects in the first image and background pixels in the N frames of second images at positions corresponding to the M second objects in the first image;
and a second determining module 850, configured to perform pixel filling on the first pixel position set by using the background pixels at the positions corresponding to the M second objects in the first image in the N frames of second images, so as to obtain a target image.
In the embodiment of the application, in the case of displaying a preview image of a shooting scene, the preview image is imaged in response to a first input of a user to obtain a first image, where the first image includes a first object and M second objects; a video clip of the shooting scene in a first time period after the first input is received is then acquired to obtain N frames of second images; and the first pixel position set is then filled with the background pixels at the positions corresponding to the M second objects in the first image in the N frames of second images, so as to obtain a target image, which reduces artifacts and blurring in the case of a complex background.
In some embodiments of the present application, the preview image includes at least one object, and the apparatus may further include:
a second receiving module, configured to receive a second input of the user on the preview image before the first input of the user is received;
a third determining module, configured to determine first position information of the second input in response to the second input;
a fourth determining module, configured to determine, after the first image is obtained, second position information corresponding to the at least one object according to a first skeleton coordinate set corresponding to the at least one object respectively;
and a fifth determining module, configured to determine the first object and the M second objects from the at least one object according to distances between the first position information and the second position information corresponding to the at least one object respectively.
In some embodiments of the present application, the fifth determining module is specifically configured to:
calculate distances between the first position information and the second position information corresponding to the at least one object respectively, to obtain a first distance corresponding to the at least one object respectively;
determine an object corresponding to a minimum value in the first distances respectively corresponding to the at least one object as the first object;
and determine the objects other than the first object in the at least one object as the M second objects.
In some embodiments of the present application, the extraction module 840 may include:
a first determining unit, configured to convert the first image into a gray scale map to obtain a first gray scale image;
a first segmentation unit, configured to segment the foreground of the first gray scale image to obtain a first foreground image;
and a second determining unit, configured to determine the first pixel position set of the M second objects in the first image based on the first foreground image and the first skeleton coordinate sets respectively corresponding to the M second objects.
In some embodiments of the present application, the second determining unit is specifically configured to:
determine foreground areas corresponding to the M second objects based on the first foreground image and the first skeleton coordinate sets respectively corresponding to the M second objects;
and add masks to pixels of the other foreground areas in the first foreground image except the foreground areas corresponding to the M second objects, to obtain the first pixel position set of the M second objects in the first image.
In some embodiments of the present application, the extraction module 840 may further include:
an acquisition unit, configured to match the i-th frame of second image with the first image and acquire a second pixel position set matched with the first pixel position set in the i-th frame of second image, where i is a positive integer and 1 ≤ i ≤ N;
and a third determining unit, configured to determine the background pixels at the second pixel position set in the i-th frame of second image as the background pixels at the positions corresponding to the M second objects in the first image in the i-th frame of second image.
In some embodiments of the present application, the second determining module 850 may include:
an extracting unit, configured to extract a third pixel position set of P third objects in the i-th frame of second image, where the P third objects are not identical to the M second objects, and P is a positive integer;
a fourth determining unit, configured to calculate an intersection of the first pixel position set and the third pixel position set to obtain a fourth pixel position set;
and a fifth determining unit, configured to fill background pixels of a first target pixel position set in the third pixel position set into a second target pixel position set in the first pixel position set to obtain a target image, where the first target pixel position set is the pixel position set in the second pixel position set except the fourth pixel position set, and the second target pixel position set is the pixel position set in the first pixel position set matched with the first target pixel position set.
In some embodiments of the present application, the fifth determining unit is specifically configured to:
fill the background pixels of the first target pixel position set in the third pixel position set into the second target pixel position set in the first pixel position set to obtain a candidate target image;
update the first pixel position set to the fourth pixel position set;
and, in a case where there is a second image among the N frames of second images that has not been matched with the first image and the updated first pixel position set is not empty, update the j-th frame of second image to the i-th frame of second image, update the candidate target image to the first image, and return to the step of filling the first pixel position set with the N frames of second images to obtain a target image, until the N frames of second images have all been matched with the first image or the updated first pixel position set is empty, where j is a positive integer, 1 ≤ j ≤ N, and j ≠ i.
In some embodiments of the present application, after said updating said first set of pixel positions to said fourth set of pixel positions, said fifth determining unit is further specifically configured to:
determining a fifth pixel position set matched with the updated first pixel position set in the N frames of second images under the condition that the N frames of second images are completely matched with the first image and the updated first pixel position set is not empty;
and filling background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N frames of second images into the updated first pixel position set to obtain a target image.
In some embodiments of the present application, the apparatus referred to above may further include:
a sixth determining module, configured to convert the i-th frame of second image into a gray scale map to obtain a second gray scale image, before the i-th frame of second image among the N frames of second images is matched with the first image to obtain the second pixel position set matched with the first pixel position information set in the i-th frame of second image;
a seventh determining module, configured to match the second gray scale image with the first gray scale image to obtain a transformation matrix between the second gray scale image and the first gray scale image;
and an eighth determining module, configured to convert, based on the transformation matrix, the second gray scale image to the coordinate system corresponding to the first gray scale image, to obtain a third gray scale image;
the acquisition unit is specifically configured to:
match the third gray scale image with the first image to obtain the second pixel position set matched with the first pixel position information set in the third gray scale image.
In some embodiments of the present application, the seventh determining module is specifically configured to:
match features in the second gray scale image with features in the first gray scale image to obtain Q matching point pairs;
and determine the transformation matrix between the second gray scale image and the first gray scale image according to the Q matching point pairs.
The photographing device in the embodiment of the application may be an electronic device or a component in the electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. The electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a mobile internet device (Mobile Internet Device, MID), an augmented reality (Augmented Reality, AR)/virtual reality (Virtual Reality, VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (Personal Digital Assistant, PDA), etc., and may also be a server, a network attached storage (Network Attached Storage, NAS), a personal computer (Personal Computer, PC), a television (TV), a teller machine, a self-service machine, etc., which are not particularly limited in the embodiments of the present application.
The photographing device in the embodiment of the application may be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiment of the present application.
The photographing device provided by the embodiment of the present application can implement each process implemented by the method embodiment of fig. 1, and in order to avoid repetition, details are not repeated here.
Optionally, as shown in Fig. 9, the embodiment of the present application further provides an electronic device 900, which includes a processor 901 and a memory 902, where the memory 902 stores a program or an instruction that can be executed on the processor 901. When the program or the instruction is executed by the processor 901, the steps of the above photographing method embodiment are implemented, and the same technical effects can be achieved; to avoid repetition, details are not repeated here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
Fig. 10 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1000 includes, but is not limited to, a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and a processor 1010.
Those skilled in the art will appreciate that the electronic device 1000 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 1010 through a power management system so as to manage charging, discharging, and power consumption through the power management system. The electronic device structure shown in Fig. 10 does not constitute a limitation of the electronic device, and the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, which is not described in detail herein.
Wherein, the user input unit 1007 is configured to receive a first input of a user in a case of displaying a preview image;
The processor 1010 is configured to respond to the first input, image the preview image to obtain a first image, where the first image includes a first object and M second objects, collect video segments of the shooting scene in a first period after receiving the first input to obtain N frames of second images, where M and N are positive integers, extract a first pixel position set of the M second objects in the first image and background pixels corresponding to the M second objects in the first image in the N frames of second images, and fill pixels in the first pixel position set with the background pixels corresponding to the M second objects in the first image in the N frames of second images to obtain a target image.
In this way, in the case of displaying a preview image of a shooting scene, the preview image is imaged in response to a first input of a user to obtain a first image including a first object and M second objects; a video clip of the shooting scene in a first period after the first input is received is then acquired to obtain N frames of second images; and the first pixel position set is then filled with the background pixels at the positions corresponding to the M second objects in the first image in the N frames of second images, so as to obtain a target image, which reduces artifacts and blurring in the case of a complex background.
Optionally, the preview image includes at least one object, and the user input unit 1007 is further configured to receive a second input from the user on the preview image before the receiving the first input from the user;
The processor 1010 is further configured to determine, in response to the second input, first location information of the second input, determine, after the obtaining of the first image, second location information of the at least one object according to a first skeleton coordinate set corresponding to the at least one object, respectively, and determine, from the at least one object, the first object and M second objects according to distances of the first location information and the second location information of the at least one object, respectively.
In this way, by responding to the second input of the user to the preview image, the first position information of the second input is determined, then the second position information corresponding to the at least one object is determined according to the first skeleton coordinate set corresponding to the at least one object, that is, the first object and the M second objects can be determined from the at least one object according to the distance between the first position information and the second position information corresponding to the at least one object, so that the first object and the second object can be determined according to the skeleton coordinate set of each object in the preview image, and the accuracy of determining the first object and the second object is improved.
Optionally, the processor 1010 is further configured to calculate a distance between the first location information and the second location information corresponding to the at least one object, so as to obtain a first distance corresponding to the at least one object, determine an object corresponding to a minimum value in the first distances corresponding to the at least one object as the first object, and determine other objects except the first object in the at least one object as the M second objects.
In this way, the distance between the first position information and the second position information corresponding to at least one object is calculated to obtain the first distance corresponding to at least one object, then the object corresponding to the minimum value in the first distances corresponding to at least one object is determined to be the first object, other objects except the first object in at least one object are determined to be M second objects, and therefore the first object and the second object in the first image can be accurately determined, and the accuracy of determining the first object and the second object is further improved.
Optionally, the processor 1010 is further configured to convert the first image into a gray scale image to obtain a first gray scale image, segment a foreground of the first gray scale image to obtain a first foreground image, and determine a first set of pixel positions of the M second objects in the first image based on the first set of skeleton coordinates corresponding to the first foreground image and the M second objects, respectively.
In this way, the first image is converted into the gray level image to obtain the first gray level image, then the foreground in the first gray level image is extracted to obtain the first foreground image, and then the first pixel position set of the M second objects in the first image can be accurately obtained according to the first skeleton coordinate set respectively corresponding to the first foreground image and the M second objects.
Optionally, the processor 1010 is further configured to determine foreground regions corresponding to the M second objects based on the first skeleton coordinate sets corresponding to the first foreground image and the M second objects, and add masks to pixels of other foreground regions in the first foreground image except for the foreground regions corresponding to the M second objects, so as to obtain a first pixel position set of the M second objects in the first image.
In this way, the foreground regions corresponding to the M second objects can be determined by comparing the first foreground image with the first skeleton coordinate sets corresponding to the M second objects, and then the mask is added to the pixels of the other foreground regions except the foreground regions corresponding to the M second objects in the first foreground image, so that the first pixel position sets of the M second objects in the first image can be accurately obtained, and the determination accuracy of the first pixel position sets is further improved.
Optionally, the processor 1010 is further configured to match the i-th frame of second image among the N frames of second images with the first image, obtain a second pixel position set in the i-th frame of second image that is matched with the first pixel position set, where i is a positive integer and 1 ≤ i ≤ N, and determine the background pixels at the second pixel position set in the i-th frame of second image as the background pixels in the i-th frame of second image at the positions corresponding to the M second objects in the first image.
In this way, by matching the i-th frame of second image among the N frames of second images with the first image, the second pixel position set matched with the first pixel position set in the i-th frame of second image is obtained, so that the background pixels at the second pixel position set can be directly determined as the background pixels in the i-th frame of second image at the positions corresponding to the M second objects in the first image, which improves the accuracy of determining those background pixels.
Optionally, the processor 1010 is further configured to extract a third pixel position set of P third objects in the second image of the i-th frame, where the P third objects are not identical to the M second objects, P is a positive integer, calculate an intersection of the first pixel position set and the third pixel position set to obtain a fourth pixel position set, and fill a background pixel of the first target pixel position set in the third pixel position set into a second target pixel position set in the first pixel position set to obtain a target image, where the first target pixel position set is a pixel position set of the third pixel position set other than the fourth pixel position set, and the second target pixel position set is a pixel position set of the first pixel position set that is matched with the first target pixel position set.
In this way, compared with the prior art in which background pixels at positions symmetrical to the first pixel position set in the first image are used for filling, in the embodiment of the application the third pixel position set of the P third objects in the i-th frame of second image is extracted, the intersection of the first pixel position set and the third pixel position set is then calculated to obtain the fourth pixel position set, and the background pixels of the first target pixel position set in the third pixel position set are filled into the second target pixel position set in the first pixel position set to obtain the target image, so that the artifact and blurring problems of the resulting picture can be reduced and an image with a better effect can be obtained.
Optionally, the processor 1010 is further configured to fill the background pixels of the first target pixel position set in the third pixel position set into the second target pixel position set in the first pixel position set to obtain a candidate target image, update the first pixel position set to be the pixel position set in the first pixel position set other than the second target pixel position set, and, when it is determined that there is a second image among the N frames of second images that has not been matched with the first image and the updated first pixel position set is not empty, update the j-th frame of second image to the i-th frame of second image, update the candidate target image to the first image, and return to execute the step of filling the first pixel position set with the N frames of second images to obtain a target image, until the N frames of second images have all been matched with the first image or the updated first pixel position set is empty, where j is a positive integer, 1 ≤ j ≤ N, and j ≠ i.
In this way, the background pixels of the first target pixel position set in the third pixel position set are filled into the second target pixel position set in the first pixel position set to obtain a candidate target image, and the first pixel position set is then updated to the fourth pixel position set; when it is determined that there is a second image among the N frames of second images that has not been matched with the first image and the updated first pixel position set is not empty, the j-th frame of second image is updated to the i-th frame of second image, the candidate target image is updated to the first image, and the step of filling the first pixel position set with the N frames of second images to obtain a target image is executed again, until the N frames of second images have all been matched with the first image or the updated first pixel position set is empty.
Optionally, the processor 1010 is further configured to, after the updating the first pixel position set to the fourth pixel position set, determine a fifth pixel position set matched with the updated first pixel position set in the N-frame second image if it is determined that the N-frame second image is matched with the first image and the updated first pixel position set is not empty, and fill background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N-frame second image into the updated first pixel position set to obtain a target image.
In this way, under the condition that the N frames of second images are matched with the first image, and the updated first pixel position set is not empty, a fifth pixel position set matched with the updated first pixel position set in the N frames of second images can be determined, then background pixels of a sixth pixel position set symmetrical to the fifth pixel position set in the N frames of second images are filled into the updated first pixel position set to obtain a target image, and therefore the first pixel position sets corresponding to M second objects in the first image can be completely filled, and the obtained target image is the image with the completed background filling, so that the requirements of users can be met.
Optionally, the processor 1010 is further configured to, before the matching the i-th frame second image of the N-th frame second images with the first image to obtain a second pixel position set matched with the first pixel position information set in the i-th frame second image, convert the i-th frame second image into a gray scale image to obtain a second gray scale image, match the second gray scale image with the first gray scale image to obtain a transformation matrix between the second gray scale image and the first gray scale image, convert the second gray scale image into a coordinate system corresponding to the first gray scale image based on the transformation matrix to obtain a third gray scale image, and match the third gray scale image with the first image to obtain a second pixel position set matched with the first pixel position information set in the third gray scale image.
In this way, the second gray level image is obtained by converting the second image of the ith frame into the gray level image, the second gray level image is matched with the first gray level image, a transformation matrix between the second gray level image and the first gray level image is obtained, then the second gray level image is converted into a coordinate system corresponding to the first gray level image based on the transformation matrix, a third gray level image is obtained, and the matching can be carried out on the basis of the third gray level image and the first image, so that the consistency of the picture range and the coordinate system defined by the N frames of the second image and the first image is ensured, and the accuracy of background filling of the first pixel position set in the first image is further improved.
Optionally, the processor 1010 is further configured to match features in the second gray scale image with features in the first gray scale image to obtain Q matching point pairs, and determine a transformation matrix between the second gray scale image and the first gray scale image according to the Q matching point pairs.
In this way, the Q matching point pairs are obtained by matching the features in the second gray level image with the features in the first gray level image, and the transformation matrix between the second gray level image and the first gray level image is determined according to the Q matching point pairs, so that the transformation matrix between the second gray level image and the first gray level image can be accurately obtained.
It should be appreciated that in embodiments of the present application, the input unit 1004 may include a graphics processor (Graphics Processing Unit, GPU) 10041 and a microphone 10042, where the graphics processor 10041 processes image data of still pictures or video obtained by an image capturing device (e.g., a color camera) in a video capturing mode or an image capturing mode. The display unit 1006 may include a display panel 10061, and the display panel 10061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071 is also referred to as a touch screen. The touch panel 10071 can include two portions, a touch detection device and a touch controller. Other input devices 10072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first storage area storing programs or instructions and a second storage area storing data, where the first storage area may store an operating system, and application programs or instructions (such as a sound playing function and an image playing function) required for at least one function, and the like. Further, the memory 1009 may include volatile memory or non-volatile memory, or the memory 1009 may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable programmable ROM (Erasable PROM, EPROM), an electrically erasable programmable ROM (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), a static random access memory (Static RAM, SRAM), a dynamic random access memory (Dynamic RAM, DRAM), a synchronous dynamic random access memory (Synchronous DRAM, SDRAM), a double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), an enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), a synch-link dynamic random access memory (Synch-Link DRAM, SLDRAM), or a direct rambus random access memory (Direct Rambus RAM, DRRAM). The memory 1009 in the embodiments of the application includes, but is not limited to, these and any other suitable types of memory.
The processor 1010 may include one or more processing units, and optionally the processor 1010 integrates an application processor that primarily processes operations involving an operating system, user interface, application program, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 1010.
The embodiment of the application also provides a readable storage medium, on which a program or an instruction is stored, which when executed by a processor, implements the processes of the embodiment of the photographing method, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here.
Wherein, the processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
The embodiment of the application further provides a chip, which comprises a processor and a communication interface, wherein the communication interface is coupled with the processor, and the processor is used for running programs or instructions to realize the processes of the embodiment of the photographing method, and can achieve the same technical effects, so that repetition is avoided, and the description is omitted here.
It should be understood that the chip referred to in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
Embodiments of the present application provide a computer program product stored in a storage medium, where the program product is executed by at least one processor to implement the respective processes of the above-described photographing method embodiment, and achieve the same technical effects, and for avoiding repetition, a detailed description is omitted herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises that element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in the reverse order depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a computer software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk), including instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above specific embodiments, which are merely illustrative and not restrictive. In light of the present application, a person of ordinary skill in the art may make many further forms without departing from the spirit of the present application and the scope of protection of the claims, all of which fall within the protection of the present application.