
CN112887623B - Image generation method, device and electronic equipment - Google Patents


Info

Publication number
CN112887623B
CN112887623B
Authority
CN
China
Prior art keywords
target
video frame
video
input
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110121873.8A
Other languages
Chinese (zh)
Other versions
CN112887623A
Inventor
王康康
张彬熠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Priority to CN202110121873.8A
Publication of CN112887623A
Application granted
Publication of CN112887623B
Status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95 Computational photography systems, e.g. light-field imaging systems
    • H04N23/951 Computational photography systems, e.g. light-field imaging systems, by using two or more images to influence resolution, frame rate or aspect ratio
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682 Vibration or motion blur correction
    • H04N23/684 Vibration or motion blur correction performed by controlling the image sensor readout, e.g. by controlling the integration time

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)

Abstract

Embodiments of the present application disclose an image generation method, device, and electronic equipment. The method includes: receiving a user's first input on a long-exposure image, where the long-exposure image and a target video are obtained by a camera shooting a target scene simultaneously, and the target video includes multiple video frames; and, in response to the first input, performing fusion processing on first video frames in the target video to generate a first target image, where the first video frames are obtained by removing a target object from the multiple video frames, and the target object includes a second video frame that contains a first target area, or the first target area within the second video frame, the first target area being associated with the first input. According to the embodiments of the present application, the problem that a user often needs to spend a long time to obtain a satisfactory long-exposure image can be solved.

Figure 202110121873

Description

Image generation method, device and electronic equipment

Technical Field

Embodiments of the present application relate to the field of information processing, and in particular to an image generation method, an image generation device, and an electronic device.

Background

At present, as the photography functions of electronic devices grow increasingly powerful, the light-trail ("streamer") shutter mode has become widely popular with users: it automatically extends the shutter time and fuses the dynamic changes of a subject over a period of time into a single long-exposure image. However, when a user shoots a long-exposure image with a handheld electronic device, camera shake is hard to avoid, so the resulting long-exposure image is often of poor quality.

In the course of implementing this application, the applicant found that the prior art has at least the following problem:

in order to obtain a satisfactory long-exposure image, the user needs to shoot multiple times, which makes the operation cumbersome.

Summary

Embodiments of the present application provide an image generation method, an image generation device, and an electronic device, which can solve the problem that a user must shoot multiple times, through a cumbersome procedure, to obtain a satisfactory long-exposure image.

To solve the above technical problem, the present application is implemented as follows.

In a first aspect, an embodiment of the present application provides an image generation method, which may include:

receiving a user's first input on a long-exposure image, where the long-exposure image and a target video are obtained by a camera shooting a target scene simultaneously, and the target video includes multiple video frames; and

in response to the first input, performing fusion processing on first video frames in the target video to generate a first target image,

where the first video frames are obtained by removing a target object from the multiple video frames, and the target object includes a second video frame that contains a first target area, or the first target area within the second video frame, the first target area being associated with the first input.
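To make the claimed flow concrete, the first-aspect steps can be sketched as follows. This is a minimal illustration under assumed names, not the patent's implementation: `frames` stands for the target video's frames and `contains_target_area` for a hypothetical predicate produced by the region-detection step.

```python
import numpy as np

def generate_target_image(frames, contains_target_area):
    """Sketch of the claimed method: drop every frame that contains the
    user-selected first target area, then fuse the remaining frames."""
    # First video frames: those obtained by removing the target object.
    first_frames = [f for f in frames if not contains_target_area(f)]
    # Fusion processing, here approximated by a per-pixel mean, which
    # mimics a long exposure of the surviving frames.
    return np.mean(np.stack(first_frames), axis=0)

# Toy example: three 2x2 grayscale frames; the middle one "contains" the area.
frames = [np.full((2, 2), v, dtype=float) for v in (10.0, 100.0, 30.0)]
result = generate_target_image(frames, lambda f: f[0, 0] == 100.0)
```

With the middle frame removed, the fused result is the mean of the other two frames.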

In a second aspect, an embodiment of the present application provides an image generation device, which may include:

a receiving module, configured to receive a user's first input on a long-exposure image, where the long-exposure image and a target video are obtained by a camera shooting a target scene simultaneously, and the target video includes multiple video frames; and

a generating module, configured to, in response to the first input, perform fusion processing on first video frames in the target video to generate a first target image,

where the first video frames are obtained by removing a target object from the multiple video frames, and the target object includes a second video frame that contains a first target area, or the first target area within the second video frame, the first target area being associated with the first input.

In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the method of the first aspect.

In a fourth aspect, an embodiment of the present application provides a readable storage medium storing a program or instructions, where the program or instructions, when executed by a processor, implement the steps of the method of the first aspect.

In a fifth aspect, an embodiment of the present application provides a chip, including a processor and a communication interface coupled to the processor, where the processor is configured to run a program or instructions to implement the method of the first aspect.

In the embodiments of the present application, in response to a user's first input on a long-exposure image, first video frames, obtained by removing a target object from the video frames of a target video, are fused to generate a first target image. Because the long-exposure image and the target video are captured by the camera from the same target scene at the same time, the target video records both the dynamic information and the static information present in the long-exposure image. Here, the target object includes a second video frame that contains the first target area associated with the first input, or the first target area within that second video frame. Consequently, by fusing the first video frames, an image can be synthesized from video frames that do not contain the first target area, yielding the first target image the user expects; this improves fault tolerance when shooting long-exposure images and reduces the number of shots required.

Brief Description of the Drawings

The present application can be better understood from the following description of specific embodiments of the present application taken in conjunction with the accompanying drawings, in which the same or similar reference numerals denote the same or similar features.

FIG. 1 is a schematic diagram of an application scenario of the image generation method provided by an embodiment of the present application;

FIG. 2 is a flowchart of an image generation method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of an interface for displaying a long-exposure image and a target video provided by an embodiment of the present application;

FIG. 4 is a schematic diagram of an interface for displaying a second target area provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of an interface for displaying a first target area provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of an interface for displaying a second identifier provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of an interface for displaying a target time period provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of an interface for displaying removal of the first target area provided by an embodiment of the present application;

FIG. 9 is a schematic diagram of an interface for displaying fusion of a third target area provided by an embodiment of the present application;

FIG. 10 is a schematic structural diagram of an image generation device provided by an embodiment of the present application;

FIG. 11 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application;

FIG. 12 is a schematic diagram of a hardware structure of another electronic device provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.

The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the application can be practiced in orders other than those illustrated or described herein. In addition, "and/or" in the specification and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the objects it joins.

The image generation method provided by the embodiments of the present application can be applied at least to the following application scenario, described below.

At present, with the continuous development of electronic devices, most of them offer shooting functions related to the streamer shutter. The streamer shutter captures the motion trajectories of dynamic objects and automatically extends the shutter time, producing striking photographs of light and shadow.

Because the essence of the streamer shutter is to fuse the changes of dynamic objects over a period of time into a single long-exposure image, shooting one long-exposure image necessarily takes a certain amount of time. When a user shoots handheld, camera shake easily degrades the quality of the resulting long-exposure image. To obtain a satisfactory one, the user must spend time on a second shot, or even multiple shots, which is cumbersome.

As shown in FIG. 1, a long-exposure image includes a subject 10 that the user expects to appear in it, and also an impurity object 20 that the user does not want in it. For example, if the user wants to shoot a long-exposure image that conveys "heavy traffic", the vehicles driving on the road are the subject 10 the user expects, while a pedestrian passing by the roadside is the impurity object 20. If the user wants a long-exposure image without the impurity object 20, the only option is to reshoot. But during the reshoot other pedestrians may pass by, so the success rate of capturing a satisfactory long-exposure image is low and the operation is cumbersome.

In view of these problems in the related art, embodiments of the present application provide an image generation method, device, electronic device, and storage medium, which can solve the problem that a user must shoot multiple times, through a cumbersome procedure, to obtain a satisfactory long-exposure image.

Besides the above scenario, the method provided by the embodiments of the present application can be applied to any scene in which the captured long-exposure image is of poor quality.

With the method provided by the embodiments of the present application, in response to a user's first input on a long-exposure image, first video frames, obtained by removing a target object from the video frames of a target video, are fused to generate a first target image. Because the long-exposure image and the target video are captured by the camera from the same target scene at the same time, the target video records both the dynamic information and the static information in the long-exposure image. The target object includes a second video frame that contains the first target area associated with the first input, or the first target area within that second video frame. Thus, by fusing the first video frames, an image can be synthesized from video frames that do not contain the first target area, yielding the first target image the user expects, improving fault tolerance when shooting long-exposure images and reducing the number of shots required.

Based on the above application scenario, the image generation method provided by the embodiments of the present application is described in detail below.

FIG. 2 is a flowchart of an image generation method provided by an embodiment of the present application.

As shown in FIG. 2, the image generation method, which is applied to an image generation device, may include steps 210 and 220, as follows.

Step 210: receive a user's first input on a long-exposure image, where the long-exposure image and a target video are obtained by a camera shooting a target scene simultaneously, and the target video includes multiple video frames.

Step 220: in response to the first input, perform fusion processing on first video frames in the target video to generate a first target image, where the first video frames are obtained by removing a target object from the multiple video frames, and the target object includes a second video frame that contains a first target area, or the first target area within the second video frame, the first target area being associated with the first input.

In the image generation method provided by the embodiments of the present application, in response to a user's first input on a long-exposure image, first video frames, obtained by removing a target object from the video frames of a target video, are fused to generate a first target image. Because the long-exposure image and the target video are captured by the camera from the same target scene at the same time, the target video records both the dynamic information and the static information in the long-exposure image. The target object includes a second video frame that contains the first target area associated with the first input, or the first target area within that second video frame. Thus, by fusing the first video frames, an image can be synthesized from video frames that do not contain the first target area, yielding the first target image the user expects, improving fault tolerance when shooting long-exposure images and reducing the number of shots required.

Steps 210 and 220 are described below in turn.

First, step 210.

The long-exposure image and the target video are obtained by the camera shooting the target scene simultaneously; the target video includes multiple video frames.

Before the above step of receiving the user's first input on the long-exposure image, the method may further include the following steps:

while the camera's viewfinder displays the target scene, receiving a user's shooting input; and, in response to the shooting input, capturing a long-exposure image and a target video of the target scene.

That is, the long-exposure image and the target video can be captured simultaneously by the same camera from the same target scene, and the target video records the dynamic information and static information in the long-exposure image.

As shown in FIG. 3, in response to the shooting input, the target scene is shot, yielding an associated long-exposure image and target video. The long-exposure image retains the dynamic information of the target scene during the shooting time (for example, the image areas corresponding to moving vehicles, pedestrians, and light trails) and its static information (for example, the image areas corresponding to buildings); the target video likewise retains the dynamic and static information of the target scene during the shooting time.
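Because the target video covers the same interval as the long exposure, a long-exposure-like image can in principle be rebuilt from its frames. As an assumed illustration (not taken from the patent): a per-pixel mean approximates motion blur, while a per-pixel maximum approximates light trails.

```python
import numpy as np

def fuse_mean(frames):
    # Per-pixel average: moving objects smear into motion blur.
    return np.mean(np.stack(frames), axis=0)

def fuse_max(frames):
    # Per-pixel maximum: bright moving lights leave trails.
    return np.max(np.stack(frames), axis=0)

# A bright "light" moves across a 1x3 strip over three frames.
f1 = np.array([[255.0, 0.0, 0.0]])
f2 = np.array([[0.0, 255.0, 0.0]])
f3 = np.array([[0.0, 0.0, 255.0]])
trail = fuse_max([f1, f2, f3])   # full light trail across the strip
blur = fuse_mean([f1, f2, f3])   # each pixel dimmed to one third
```

The same fusion can be re-run after frames are removed, which is what makes the editing described below possible without reshooting.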

Here, the long-exposure image includes dynamic information and static information of the target scene, where the dynamic information indicates the image areas corresponding to dynamic objects in the target scene and the static information indicates the image areas corresponding to static objects. The above step of performing fusion processing on the first video frames in the target video to generate the first target image may specifically include the following steps:

identifying the dynamic objects and static objects in the first video frames; extracting the dynamic information of the dynamic objects and the static information of the static objects from the first video frames; fusing the dynamic information across the multiple first video frames to generate target dynamic information; and fusing the static information with the target dynamic information to generate the first target image.

Here, the dynamic objects may include moving vehicles, pedestrians, and light trails, while the static objects may include immobile things such as buildings and roads. By identifying and retaining the static information of the static objects, and extracting the dynamic information of the dynamic objects from each first video frame to synthesize a dynamic effect (the target dynamic information), the method finally produces a first target image exhibiting effects such as motion blur or light trails.
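One way such a fusion could be sketched, assuming a mask that marks the static areas (the patent does not specify the fusion operator), is to take static pixels from a single reference frame and fuse dynamic pixels across all first video frames:

```python
import numpy as np

def fuse_with_mask(frames, static_mask):
    """static_mask is True where the scene is static (e.g. buildings).
    Static pixels come from one reference frame; dynamic pixels are
    fused (max here, to mimic light trails) across all frames."""
    stack = np.stack(frames)
    dynamic = np.max(stack, axis=0)   # fused target dynamic information
    static = frames[0]                # static information, reference frame
    return np.where(static_mask, static, dynamic)

# 1x2 strip: left pixel is static (a wall, value 50);
# right pixel is a light that appears only in the second frame.
frames = [np.array([[50.0, 0.0]]), np.array([[50.0, 200.0]])]
mask = np.array([[True, False]])
out = fuse_with_mask(frames, mask)
```

The static pixel keeps its reference value while the dynamic pixel keeps the fused trail value, matching the described separation of static and dynamic information.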

Next, step 220.

In response to the first input, fusion processing is performed on the first video frames in the target video to generate the first target image, where the first video frames are obtained by removing the target object from the multiple video frames.

Here, the target object includes a second video frame that contains the first target area, or the first target area within the second video frame.

That is, either the second video frame containing the first target area, or the first target area within the second video frame, is removed from the multiple video frames, and a cleaner first target image can then be generated from the first video frames that remain.

The first target area is associated with the first input. The first input may indicate the first target area in the long-exposure image, i.e. the area the user wants removed; or it may indicate a second target area in the long-exposure image, i.e. the area the user wants kept, in which case everything in the long-exposure image outside the second target area can be the first target area. When the first input indicates the second target area in the long-exposure image, the first target area does not overlap the second target area.

When the first input indicates the second target area in the long-exposure image, the second target area may include the subject the user expects to appear in the long-exposure image.

As shown in FIG. 4, the long-exposure image includes a roughly circular light trail and also some other impurity areas. Since the second target area may include the subject the user wants to keep in the long-exposure image, the first target area correspondingly may include the impurity objects the user does not want in it.

First, in response to the user's first input on the second target area, the areas of the long-exposure image other than the second target area are identified; then the target object is removed from the multiple video frames to obtain the first video frames; finally, the first video frames in the target video are fused to generate the first target image.

Here, by removing from the multiple video frames either the second video frame that contains the first target area, or the first target area within the second video frame, a first target image that better matches the user's expectations can be generated from the first video frames that remain.
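When the first input marks the area to keep, the area to remove is simply its complement. A minimal sketch, with `keep_mask` as an assumed boolean mask derived from the user's input:

```python
import numpy as np

def removal_mask(image_shape, keep_mask):
    """The first target area is everything the user did NOT mark to keep
    (the complement of the second target area)."""
    assert keep_mask.shape == image_shape
    return ~keep_mask

# User keeps only the top-left region of a 2x2 image.
keep = np.zeros((2, 2), dtype=bool)
keep[0, 0] = True
remove = removal_mask((2, 2), keep)
```

The resulting mask covers the three unmarked pixels, which are then treated as the first target area in the frame-removal steps below.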

In one possible embodiment, the first input indicates the first target area, and before the above step of fusing the first video frames in the target video, the method may further include the following steps:

identifying, in the target video, the second video frames that contain the first target area; and removing the second video frames from the video frames to obtain the first video frames.

In response to the first input, the second video frames that contain the first target area are deleted, and the remaining first video frames are fused again to generate a better long-exposure image. As shown in FIG. 5, the long-exposure image includes a roughly circular light trail as well as some other impurity objects, and the first target area may include the areas where the impurity objects the user does not want in the long-exposure image are located.

First, in response to the user's first input on the first target area, the first target area of the long-exposure image is identified; then the second video frames containing the first target area are removed from the multiple video frames to obtain the first video frames; finally, the first video frames in the target video are fused to generate the first target image.

Here, by removing the second video frames that contain the first target area from the multiple video frames, a first target image that better matches the user's expectations can be generated from the remaining first video frames.
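Identifying which frames contain the first target area could be sketched, for illustration, as a change-detection test against an assumed clean background inside that area (the patent does not specify the detection method):

```python
import numpy as np

def frames_without_area(frames, background, area_mask, threshold=10.0):
    """Drop every frame whose pixels inside area_mask differ noticeably
    from the background, i.e. frames where the unwanted object appears."""
    kept = []
    for f in frames:
        diff = np.abs(f - background)[area_mask]
        if diff.size == 0 or diff.max() < threshold:
            kept.append(f)
    return kept

# 2x2 frames; the "first target area" is the top-right pixel.
background = np.zeros((2, 2))
area = np.array([[False, True], [False, False]])
clean = np.zeros((2, 2))
dirty = np.zeros((2, 2))
dirty[0, 1] = 80.0        # an impurity object appears inside the area
kept = frames_without_area([clean, dirty, clean], background, area)
```

Only the frames with nothing in the marked area survive, and those survivors are the first video frames passed to the fusion step.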

In one possible embodiment, the above step of removing the second video frames from the video frames to obtain the first video frames may specifically include the following steps:

displaying first identifiers on the time axis of the target video, where the first identifiers indicate third video frames that contain the first target area, and the third video frames include the second video frames; receiving a user's second input on a second identifier among the first identifiers, where the second identifier corresponds to a second video frame; and, in response to the second input, removing the second video frame from the video frames to obtain the first video frames.

The first identifiers are displayed on the time axis of the target video so that the user can choose which second video frames to delete. As shown in FIG. 6, the time axis of the target video carries multiple first identifiers; each first identifier indicates a third video frame containing the first target area, and the second identifier corresponds to a second video frame. The first identifiers include the second identifier and, correspondingly, the third video frames include the second video frames. The second video frame can then be removed from the video frames in response to the user's second input on the second identifier, yielding the first video frames.

Here, by detecting the third video frames that contain the first target area, the frames containing "impurity objects" can be quickly determined and the corresponding first identifiers displayed on the time axis, so that the user can select, from the first identifiers, the second identifier corresponding to a second video frame. The second video frames containing the first target area can then be removed from the multiple video frames, and a first target image that better matches the user's expectations can be generated from the remaining first video frames.

In another possible embodiment, the above-mentioned step of removing the second video frames from the video frames to obtain the first video frames may specifically include the following steps:

Displaying the target video together with its time axis; receiving a user's third input on a target time period in the time axis, where the video segment corresponding to the target time period includes the second video frames; and, in response to the third input, removing the second video frames from the video frames to obtain the first video frames.

The third input on the target time period may specify the start position and the end position of the target time period on the time axis, such as the two vertical lines in FIG. 7; the region between the two vertical lines is the time period to be removed. In response to the third input, the second video frames included in that time period can be removed.

Here, the target video is displayed together with its time axis, and, in response to the user's third input on the target time period in the time axis, the second video frames corresponding to the target time period are removed from the video frames to obtain the first video frames. This lets the user select the video frames to be removed, so that the second video frames that include the first target area can be removed from the plurality of video frames, and a first target image that better matches the user's expectation can be generated from the first video frames obtained after the removal.
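The time-period removal described above can be sketched as a single filter, assuming a constant frame rate so that frame `i` has timestamp `i / fps` seconds; the function name and the half-open interval convention are illustrative assumptions.

```python
def remove_time_range(frames, fps, t_start, t_end):
    """Drop every frame whose timestamp lies in [t_start, t_end) seconds."""
    return [frame for i, frame in enumerate(frames)
            if not (t_start <= i / fps < t_end)]
```

For example, with 10 frames at 5 fps, placing the two vertical lines at 0.4 s and 1.0 s removes frames 2 through 4 and keeps the rest as the first video frames.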

As an implementation of the present application, the target object is the first target area in the second video frames. To improve the synthesis effect of the long-exposure image, before the above-mentioned step of fusing the first video frames in the target video, the following steps may also be included:

In response to the first input, identifying, in the target video, the second video frames that include the first target area and the fourth video frames that do not include the first target area; and removing the first target area from the second video frames to obtain the first video frames.

Correspondingly, synthesizing the first video frames in the target video includes:

Fusing the first video frames and the fourth video frames to generate the first target image.

The above-mentioned first target area may be recognized automatically by the electronic device in response to the first input, or may be a first target area selected by the user in the long-exposure video.

Exemplarily, as shown in FIG. 8, the long-exposure image includes a roughly circular light trail, and it also includes a first target area that the user does not wish to keep in the long-exposure image. The user can select the first target area in the long-exposure video; then, in response to the user's input on the "Static" button, the electronic device treats the first target area as a static area and excludes it from fusion. Finally, in response to the user's input on the "Fuse" button, a first target image that does not contain the first target area is generated, as shown in the right part of FIG. 8.

Here, the first target image is generated by fusing the fourth video frames with the first video frames obtained by removing the first target area from the second video frames. Since the fourth video frames do not include the first target area in the first place, and the first video frames are obtained by removing the first target area from the second video frames, the first video frames do not include the first target area either. Therefore, the first target image generated by fusing the first video frames and the fourth video frames does not contain the first target area, which matches the user's expectation.
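A minimal sketch of removing the first target area from a second video frame follows. The application does not specify how the removed pixels are filled; borrowing them from a fourth video frame (one free of the target area) is one plausible choice, and all names here are illustrative assumptions.

```python
def remove_first_target_area(second_frame, area_mask, fourth_frame):
    """Replace pixels inside the masked target area with pixels from a
    clean frame, yielding a first video frame free of the target area."""
    height, width = len(second_frame), len(second_frame[0])
    return [[fourth_frame[y][x] if area_mask[y][x] else second_frame[y][x]
             for x in range(width)]
            for y in range(height)]
```

The resulting first video frames can then be fused with the fourth video frames directly, since neither set contains the target area any longer.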

In yet another possible embodiment, a fourth input on the long-exposure image is received from the user, where the fourth input indicates a third target area in the long-exposure image; in response to the fourth input, the third target area is fused with the plurality of video frames to obtain a fused target video; and the first video frames are video frames in the fused target video.

To satisfy a user who may want to add a particular light-trail region to the long-exposure image, the electronic device can respond to the user's fourth input selecting the third target area by fusing the third target area with the plurality of video frames to obtain a fused target video, and then synthesize the first target image from the first video frames in the fused target video.

Exemplarily, as shown in FIG. 9, in response to the user's fourth input selecting the "little star" (the third target area) and then the user's input on the "Dynamic" button, the electronic device determines the third target area as a dynamic area and applies the "little star" to the video frames, that is, fuses the third target area with the plurality of video frames to obtain a fused target video. Finally, in response to the user's input on the "Fuse" button, the first target image is generated based on the first video frames in the fused target video, offering more possibilities for image synthesis.

Thus, by responding to the user's fourth input selecting the third target area, fusing the third target area with the plurality of video frames to obtain a fused target video, and then fusing the first video frames in the fused target video, a first target image that satisfies the user's synthesis requirements is generated.

The above-mentioned step of fusing the third target area with the plurality of video frames in response to the fourth input to obtain the fused target video may specifically include the following steps:

In response to the fourth input, identifying a target dynamic object in the target video; determining the movement trajectory of the target dynamic object in the target video according to the positions of the target dynamic object in the video frames; and fusing the third target area with the plurality of video frames according to the movement trajectory to obtain the fused target video.
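The trajectory-based fusion above can be sketched as stamping the third target area (here reduced to a single "sticker" value) into each frame at the position the target dynamic object occupies in that frame. The frame and trajectory representations, and all names, are illustrative assumptions.

```python
def fuse_along_trajectory(frames, trajectory, sticker_value):
    """Return copies of the frames with the sticker placed at the
    dynamic object's per-frame (y, x) position."""
    fused = []
    for frame, (y, x) in zip(frames, trajectory):
        copy = [row[:] for row in frame]  # do not mutate the input frames
        copy[y][x] = sticker_value
        fused.append(copy)
    return fused
```

Synthesizing the fused frames afterwards then produces a light trail along the tracked trajectory, as in the "little star" example below.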

Optionally, besides the above-mentioned implementation of fusing the third target area with the plurality of video frames to obtain a fused target video and then generating the first target image based on the first video frames in the fused target video, the following implementation is also possible: in response to the fourth input, fusing the third target area with the first video frames to obtain fused first video frames; fusing the first video frames in the target video to generate the first target image then includes: fusing the fused first video frames to generate the first target image.

To satisfy a user who may want to add a particular light-trail region to the long-exposure image, the electronic device can respond to the user's fourth input selecting the third target area by fusing the third target area with the first video frames to obtain fused first video frames. Correspondingly, the above-mentioned step of fusing the first video frames in the target video to generate the first target image specifically includes: fusing the fused first video frames to generate the first target image.

Exemplarily, as shown in FIG. 9, in response to the user's fourth input selecting the "little star" (the third target area) and then the user's input on the "Dynamic" button, the electronic device determines the third target area as a dynamic area and applies the "little star" to the first video frames to obtain fused first video frames. Finally, in response to the user's input on the "Fuse" button, the fused first video frames are fused to generate the first target image, offering more possibilities for image synthesis.

Thus, by responding to the user's fourth input selecting the third target area, fusing the third target area with the first video frames to obtain fused first video frames, and then fusing the fused first video frames, a first target image that satisfies the user's synthesis requirements is generated.

The above-mentioned step of fusing the third target area with the first video frames in response to the fourth input to obtain the fused first video frames may specifically include the following steps:

In response to the fourth input, identifying a target dynamic object in the target video; determining the movement trajectory of the target dynamic object in the target video according to the positions of the target dynamic object in the video frames; and fusing the third target area with the first video frames according to the movement trajectory to obtain the fused first video frames.

Exemplarily, suppose the user takes a long-exposure image of a highway. A target vehicle driving on the highway can serve as the target dynamic object in the target video, and the long-exposure image also contains the "little star". In this case, in response to the fourth input, the movement trajectory of the target vehicle during the shooting of the target video is determined according to the positions of the target vehicle in the video frames; according to that trajectory, the "little star" is given a simulated movement trajectory and fused with the first video frames to obtain the fused first video frames.

Thus, the "little star" follows a movement trajectory similar to the light trail of the moving vehicle, and the first target image generated from the first video frames includes the light trail formed by the fast-moving "little star", yielding a visually striking first target image.

In addition, the third target area may be content from an image other than the above-mentioned long-exposure image, providing the user with more possibilities for image synthesis.

It can be understood that, through the first input and the fourth input, not only can the target object be removed, but the information of the third target area can also be fused at the same time to obtain the first target image.

To sum up, in the embodiments of the present application, in response to the user's first input on the long-exposure image, the first video frames, obtained by removing the target object from the video frames of the target video, are synthesized to generate the first target image. Since the long-exposure image and the target video are captured by the camera simultaneously for the same target scene, the target video records the dynamic information and the static information of the long-exposure image. The target object includes: the second video frames that include the first target area associated with the first input, or the first target area within the second video frames. Therefore, by synthesizing the first video frames, an image can be composed from video frames free of the first target area so as to generate the first target image the user expects, improving fault tolerance when shooting long-exposure images and reducing the number of shots required.

It should be noted that, for the image generation method provided in the embodiments of the present application, the execution subject may be an image generation device, or a control module in the image generation device for executing the image generation method. In the embodiments of the present application, an image generation device executing the image generation method is taken as an example to describe the image generation method provided by the embodiments of the present application.

In addition, based on the above image generation method, an embodiment of the present application further provides an image generation device, which is described in detail below with reference to FIG. 10.

FIG. 10 is a schematic structural diagram of an image generation device provided by an embodiment of the present application.

As shown in FIG. 10, the image generation device 1000 may include:

A receiving module 1010, configured to receive a user's first input on a long-exposure image, where the long-exposure image and a target video are captured by a camera simultaneously for a target scene, and the target video includes a plurality of video frames.

A generation module 1020, configured to, in response to the first input, fuse the first video frames in the target video to generate a first target image.

The first video frames are obtained by removing a target object from the plurality of video frames; the target object includes: second video frames that include a first target area, or the first target area within the second video frames, where the first target area is associated with the first input.

The first input indicates the first target area; or, the first input indicates a second target area in the long-exposure image, and the first target area is at least a partial area of the long-exposure image other than the second target area.

In a possible embodiment, the long-exposure image includes dynamic information and static information of the target scene, where the dynamic information indicates the image areas corresponding to dynamic objects in the target scene and the static information indicates the image areas corresponding to static objects in the target scene. The generation module 1020 includes:

An identification module, configured to identify the dynamic objects and the static objects in the first video frames.

An extraction module, configured to extract the dynamic information of the dynamic objects and the static information of the static objects from the first video frames.

A fusion module, configured to fuse the dynamic information in the plurality of first video frames to generate target dynamic information.

The fusion module is further configured to fuse the static information and the target dynamic information to generate the first target image.
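The identify/extract/fuse pipeline performed by these modules can be sketched per pixel: a pixel whose value varies across the first video frames is treated as dynamic and fused by per-pixel maximum (keeping the trail), while an unvarying pixel is treated as static and kept as-is. This classification rule and the function name are illustrative assumptions, not the application's prescribed algorithm.

```python
def fuse_dynamic_and_static(first_frames):
    """Fuse target dynamic information (varying pixels) with static
    information (constant pixels) into one output image."""
    height, width = len(first_frames[0]), len(first_frames[0][0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [frame[y][x] for frame in first_frames]
            if max(values) > min(values):   # dynamic pixel: fuse the trail
                out[y][x] = max(values)
            else:                           # static pixel: keep as-is
                out[y][x] = values[0]
    return out
```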

In a possible embodiment, the first input indicates the first target area, and the identification module is further configured to identify, in the target video, the second video frames that include the first target area.

A removal module, configured to remove the second video frames from the video frames to obtain the first video frames.

In a possible embodiment, the removal module includes:

A display module, configured to display first identifiers on the time axis of the target video, where a first identifier indicates a third video frame that includes the first target area, and the third video frames include the second video frames.

The receiving module 1010 is further configured to receive a user's second input on a second identifier among the first identifiers, where the second identifier corresponds to a second video frame.

The removal module is specifically configured to, in response to the second input, remove the second video frame from the video frames to obtain the first video frames.

In a possible embodiment, the removal module includes:

A display module, configured to display the target video together with its time axis.

The receiving module 1010 is further configured to receive a user's third input on a target time period in the time axis, where the video segment corresponding to the target time period includes the second video frames.

The removal module is specifically configured to, in response to the third input, remove the second video frames from the video frames to obtain the first video frames.

In a possible embodiment, the identification module is further configured to, in response to the first input, identify, in the target video, the second video frames that include the first target area and the fourth video frames that do not include the first target area.

The removal module is further configured to remove the first target area from the second video frames to obtain the first video frames.

The generation module 1020 is specifically configured to fuse the first video frames and the fourth video frames to generate the first target image.

In a possible embodiment, the receiving module 1010 is further configured to receive a user's fourth input on the long-exposure image, where the fourth input indicates a third target area in the long-exposure image.

The fusion module is further configured to, in response to the fourth input, fuse the third target area with the plurality of video frames to obtain a fused target video, where the first video frames are video frames in the fused target video.

The fusion module is further configured to, in response to the fourth input, fuse the third target area with the first video frames to obtain fused first video frames.

The fusion module is further configured to fuse the fused first video frames to generate the first target image.

In a possible embodiment, the fusion module is further configured to, in response to the fourth input, identify a target dynamic object in the target video.

A determination module is further configured to determine the movement trajectory of the target dynamic object in the target video according to the positions of the target dynamic object in the video frames.

The fusion module is further configured to fuse the third target area with the first video frames according to the movement trajectory to obtain the fused first video frames.

In a possible embodiment, the receiving module 1010 is further configured to receive a user's shooting input when the viewfinder of the camera displays the target scene.

The image generation device 1000 may further include:

A shooting module, configured to, in response to the shooting input, capture the long-exposure image and the target video of the target scene.

To sum up, the image generation device provided by the embodiments of the present application, in response to the user's first input on the long-exposure image, synthesizes the first video frames obtained by removing the target object from the video frames of the target video to generate the first target image. Since the long-exposure image and the target video are captured by the camera simultaneously for the same target scene, the target video records the dynamic information and the static information of the long-exposure image. The target object includes: the second video frames that include the first target area associated with the first input, or the first target area within the second video frames. Therefore, by synthesizing the first video frames, an image can be composed from video frames free of the first target area so as to generate the first target image the user expects, improving fault tolerance when shooting long-exposure images and reducing the number of shots required.

The image generation device in the embodiments of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. Exemplarily, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), etc.; the non-mobile electronic device may be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine, etc. This is not specifically limited in the embodiments of the present application.

The image generation device in the embodiments of the present application may be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.

The image generation device provided in the embodiments of the present application can implement the processes implemented by the image generation device in the method embodiments of FIG. 2 to FIG. 9. To avoid repetition, details are not repeated here.

Optionally, as shown in FIG. 11, an embodiment of the present application further provides an electronic device 1100, including a processor 1101, a memory 1102, and a program or instructions stored in the memory 1102 and executable on the processor 1101. When the program or instructions are executed by the processor 1101, the processes of the above image generation method embodiments are implemented, with the same technical effects; to avoid repetition, details are not repeated here.

It should be noted that the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.

FIG. 12 is a schematic diagram of the hardware structure of another electronic device provided by an embodiment of the present application.

The electronic device 1200 includes but is not limited to: a radio frequency unit 1201, a network module 1202, an audio output unit 1203, an input unit 1204, a sensor 1205, a display unit 1206, a user input unit 1207, an interface unit 1208, a memory 1209, and a processor 1210. The input unit 1204 may include a graphics processor 12041 and a microphone 12042; the display unit 1206 may include a display panel 12061; the user input unit 1207 may include a touch panel 12071 and other input devices 12072; the memory 1209 may store application programs and an operating system.

Those skilled in the art can understand that the electronic device 1200 may further include a power supply (such as a battery) for supplying power to the components, and the power supply may be logically connected to the processor 1210 through a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management system. The structure of the electronic device shown in FIG. 12 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or use a different arrangement of components, which is not repeated here.

The user input unit 1207 is configured to receive a user's first input on a long-exposure image, where the long-exposure image and a target video are captured by a camera simultaneously for a target scene, and the target video includes a plurality of video frames.

The processor 1210 is configured to, in response to the first input, synthesize the first video frames in the target video to generate a first target image.

Optionally, the processor 1210 is further configured to identify the dynamic objects and the static objects in the first video frames.

The processor 1210 is further configured to extract the dynamic information of the dynamic objects and the static information of the static objects from the first video frames.

The processor 1210 is further configured to fuse the dynamic information in the plurality of first video frames to generate target dynamic information.

The processor 1210 is further configured to fuse the static information and the target dynamic information to generate the first target image.

Optionally, the first input indicates the first target area, and the processor 1210 is further configured to identify, in the target video, the second video frames that include the first target area.

The processor 1210 is further configured to remove the second video frames from the video frames to obtain the first video frames.

The display unit 1206 is configured to display first identifiers on the time axis of the target video, where a first identifier indicates a third video frame that includes the first target area, and the third video frames include the second video frames.

The user input unit 1207 is further configured to receive a user's second input on a second identifier among the first identifiers, where the second identifier corresponds to a second video frame.

The processor 1210 is further configured to, in response to the second input, remove the second video frame from the video frames to obtain the first video frames.

Optionally, the display unit 1206 is configured to display the target video together with its time axis.

The user input unit 1207 is further configured to receive a user's third input on a target time period in the time axis, where the video segment corresponding to the target time period includes the second video frames.

The processor 1210 is further configured to, in response to the third input, remove the second video frames from the video frames to obtain the first video frames.

Optionally, the processor 1210 is further configured to, in response to the first input, identify a second video frame in the target video that includes the first target area and a fourth video frame that does not include the first target area.

The processor 1210 is further configured to remove the first target area from the second video frame to obtain the first video frame.

The processor 1210 is further configured to perform fusion processing on the first video frame and the fourth video frame to generate the first target image.

Optionally, the user input unit 1207 is further configured to receive a fourth input from the user on the long-exposure image, where the fourth input is used to indicate a third target area in the long-exposure image.

The processor 1210 is further configured to, in response to the fourth input, fuse the third target area with the multiple video frames to obtain a fused target video, where the first video frame is a video frame in the fused target video.

Optionally, the processor 1210 is further configured to, in response to the fourth input, fuse the third target area with the first video frame to obtain a fused first video frame.

The processor 1210 is further configured to perform fusion processing on the fused first video frame to generate the first target image.

Optionally, the processor 1210 is further configured to identify a target dynamic object in the target video in response to the fourth input.

The processor 1210 is further configured to determine the moving track of the target dynamic object in the target video according to the position of the target dynamic object in the video frames.

The processor 1210 is further configured to fuse the third target area with the first video frame according to the moving track to obtain the fused first video frame.
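The trajectory step above can be sketched by taking the moving track to be the per-frame centroid of the detected target dynamic object; the third target area would then be composited at each point along that track. The function name and the centroid heuristic are assumptions for illustration:

```python
def track_centroids(object_masks):
    """Hypothetical sketch: for each frame's boolean mask of the target
    dynamic object (assumed non-empty), return the (row, col) centroid;
    the sequence of centroids is the moving track."""
    track = []
    for mask in object_masks:
        points = [(y, x) for y, row in enumerate(mask)
                         for x, v in enumerate(row) if v]
        ys, xs = zip(*points)
        track.append((sum(ys) / len(ys), sum(xs) / len(xs)))
    return track
```

Anchoring the pasted third target area at these centroids keeps it aligned with the object's motion across the first video frames.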

Optionally, the user input unit 1207 is further configured to receive a shooting input from the user when the viewfinder frame of the camera displays the target scene.

The processor 1210 is further configured to, in response to the shooting input, capture the long-exposure image and the target video of the target scene.

In the embodiments of the present application, in response to the user's first input on the long-exposure image, fusion processing is performed on the first video frames, which are obtained by removing a target object from the video frames of the target video, to generate the first target image. Since the long-exposure image and the target video are captured by the camera on the target scene at the same time, the target video records the dynamic information and static information of the long-exposure image. The target object includes a second video frame that includes the first target area associated with the first input, or the first target area in the second video frame. Therefore, by performing fusion processing on the first video frames, an image can be synthesized from video frames without the first target area, so as to generate the first target image the user expects. This improves error tolerance when shooting long-exposure images and reduces the number of shots needed.

It should be understood that, in the embodiments of the present application, the input unit 1204 may include a graphics processing unit (GPU) 12041 and a microphone 12042. The graphics processor 12041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The display unit 1206 may include a display panel 12061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 1207 includes a touch panel 12071, also called a touch screen, and other input devices 12072. The touch panel 12071 may include two parts: a touch detection device and a touch controller. The other input devices 12072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described further here. The memory 1209 may be used to store software programs and various data, including but not limited to application programs and an operating system. The processor 1210 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 1210.

An embodiment of the present application further provides a readable storage medium that stores a program or instructions. When the program or instructions are executed by a processor, the processes of the above image generation method embodiments are implemented and the same technical effects can be achieved; to avoid repetition, details are not repeated here.

The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the processes of the above image generation method embodiments and achieve the same technical effects; to avoid repetition, details are not repeated here.

It should be understood that the chip mentioned in the embodiments of the present application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.

It should be noted that, in this document, the terms "comprise" and "include", or any other variants thereof, are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. In addition, it should be pointed out that the methods and apparatuses in the embodiments of the present application are not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in reverse order depending on the functions involved. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Moreover, features described with reference to certain examples may be combined in other examples.

From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.

The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by the present application, those of ordinary skill in the art can make many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.

Claims (10)

1. An image generation method, comprising:
receiving a first input of a user to a long exposure image, wherein the long exposure image and a target video are obtained by shooting a target scene by a camera at the same time; the target video comprises a plurality of video frames;
responding to the first input, performing fusion processing on a first video frame in the target video, and generating a first target image;
the first video frame is obtained by removing a target object from the plurality of video frames; the target object includes: a second video frame comprising a first target area, or the first target area in the second video frame;
the first input is used for indicating the first target area, or indicating a second target area in the long-exposure image, and the first target area is at least a partial area except the second target area in the long-exposure image;
wherein the long-exposure image comprises dynamic information and static information of the target scene, the dynamic information is used to indicate an image region corresponding to a dynamic object in the target scene, and the static information is used to indicate an image region corresponding to a static object in the target scene; and the performing fusion processing on a first video frame in the target video to generate a first target image comprises:
identifying a dynamic object and a static object in the first video frame;
extracting the dynamic information of the dynamic object and the static information of the static object from the first video frame;
performing fusion processing on the dynamic information in the plurality of first video frames to generate target dynamic information;
and performing fusion processing on the static information and the target dynamic information to generate the first target image.
2. The method of claim 1, wherein the first input is indicative of the first target region, and wherein prior to the fusing the first video frame of the target video, the method further comprises:
identifying a second video frame in the target video that includes the first target region;
and removing the second video frame from the video frame to obtain the first video frame.
3. The method of claim 2, wherein said removing the second video frame from the video frames to obtain the first video frame comprises:
displaying a first identifier on a time axis of the target video, wherein the first identifier is used for indicating a third video frame comprising the first target area, and the third video frame comprises the second video frame;
receiving a second input of a second identifier of the first identifiers by the user, wherein the second identifier corresponds to the second video frame;
in response to the second input, removing the second video frame from the video frames to obtain the first video frame.
4. The method of claim 2, wherein said removing the second video frame from the video frames to obtain the first video frame comprises:
displaying the target video including a timeline;
receiving a third input of the user for a target time period in the timeline; the video segment corresponding to the target time period comprises the second video frame;
in response to the third input, removing the second video frame from the video frames resulting in the first video frame.
5. The method according to claim 1, wherein the target object is the first target region in the second video frame, and before the fusing the first video frame in the target video, the method further comprises:
in response to the first input, identifying a second video frame of the target video that includes the first target region, and a fourth video frame that does not include the first target region;
removing the first target area from the second video frame to obtain the first video frame;
the fusing the first video frame in the target video comprises:
and performing fusion processing on the first video frame and the fourth video frame to generate the first target image.
6. The method of claim 1, further comprising:
receiving a fourth input of the long-exposure image by the user, wherein the fourth input is used for indicating a third target area in the long-exposure image;
in response to the fourth input, fusing the third target area with the plurality of video frames to obtain a fused target video; the first video frame is a video frame in the fused target video; or,
in response to the fourth input, fusing the third target area and the first video frame to obtain a fused first video frame; the fusing the first video frame in the target video to generate a first target image includes: and performing fusion processing on the fused first video frame to generate the first target image.
7. The method of claim 6, wherein said fusing the third target region with the first video frame in response to the fourth input to obtain a fused first video frame comprises:
identifying a target dynamic object in the target video in response to the fourth input;
determining the moving track of the target dynamic object in the target video according to the position of the target dynamic object in the video frame;
and fusing the third target area and the first video frame according to the movement track to obtain a fused first video frame.
8. The method of claim 1, wherein prior to said receiving a first user input of a long exposure image, the method further comprises:
receiving a photographing input of the user in a case where a finder frame of a camera displays the target scene;
capturing the long-exposure image of the target scene and the target video in response to the capture input.
9. An image generation apparatus, characterized by comprising:
the receiving module is used for receiving first input of a long exposure image by a user, wherein the long exposure image and a target video are obtained by simultaneously shooting a target scene by a camera; the target video comprises a plurality of video frames;
the generating module is used for responding to the first input, performing fusion processing on a first video frame in the target video and generating a first target image;
the first video frame is obtained by removing a target object from the plurality of video frames; the target object includes: a second video frame comprising a first target region, or the first target region in the second video frame, the first target region being associated with the first input;
the first input is used for indicating the first target area, or indicating a second target area in the long-exposure image, and the first target area is at least a partial area except the second target area in the long-exposure image;
the long-exposure image includes dynamic information and static information of the target scene, where the dynamic information is used to indicate an image area corresponding to a dynamic object in the target scene, and the static information is used to indicate an image area corresponding to a static object in the target scene, and the generating module includes:
an identification module for identifying a dynamic object and a static object in the first video frame;
an extraction module for extracting the dynamic information of the dynamic object and the static information of the static object from the first video frame;
the fusion module is used for performing fusion processing on the dynamic information in the plurality of first video frames to generate target dynamic information;
and the fusion module is further used for performing fusion processing on the static information and the target dynamic information to generate the first target image.
10. An electronic device, comprising a processor, a memory and a program stored on the memory and executable on the processor, the program, when executed by the processor, implementing the steps of the image generation method according to any one of claims 1 to 8.
CN202110121873.8A 2021-01-28 2021-01-28 Image generation method, device and electronic equipment Active CN112887623B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110121873.8A CN112887623B (en) 2021-01-28 2021-01-28 Image generation method, device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110121873.8A CN112887623B (en) 2021-01-28 2021-01-28 Image generation method, device and electronic equipment

Publications (2)

Publication Number Publication Date
CN112887623A CN112887623A (en) 2021-06-01
CN112887623B true CN112887623B (en) 2022-11-29

Family

ID=76053290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110121873.8A Active CN112887623B (en) 2021-01-28 2021-01-28 Image generation method, device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112887623B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115701869A (en) * 2021-07-19 2023-02-14 索尼集团公司 Photographic image processing method and apparatus
CN116193198A (en) * 2023-03-02 2023-05-30 维沃移动通信有限公司 Video processing method, device, electronic equipment, storage medium and product

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342871B2 (en) * 2014-05-30 2016-05-17 Apple Inc. Scene motion correction in fused image systems
WO2020113534A1 (en) * 2018-12-06 2020-06-11 华为技术有限公司 Method for photographing long-exposure image and electronic device
CN110189285B (en) * 2019-05-28 2021-07-09 北京迈格威科技有限公司 Method and device for fusion of multi-frame images
CN110324663A (en) * 2019-07-01 2019-10-11 北京奇艺世纪科技有限公司 A kind of generation method of dynamic image, device, electronic equipment and storage medium
CN110619593B (en) * 2019-07-30 2023-07-04 西安电子科技大学 Double-exposure video imaging system based on dynamic scene
CN111787223B (en) * 2020-06-30 2021-07-16 维沃移动通信有限公司 Video shooting method, device and electronic device
CN112235650A (en) * 2020-10-19 2021-01-15 广州酷狗计算机科技有限公司 Video processing method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN112887623A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN111698553B (en) Video processing method and device, electronic equipment and readable storage medium
CN112714253B (en) Video recording method and device, electronic equipment and readable storage medium
CN112492215B (en) Shooting control method and device and electronic equipment
CN113259592B (en) Shooting method and device, electronic equipment and storage medium
CN112887623B (en) Image generation method, device and electronic equipment
CN113114933A (en) Image shooting method and device, electronic equipment and readable storage medium
CN114025092A (en) Shooting control display method, device, electronic device and medium
CN111953900B (en) Picture shooting method and device and electronic equipment
CN114285988B (en) Display method, device, electronic device and storage medium
CN115842953A (en) Shooting method and device thereof
CN113794831B (en) Video shooting method, device, electronic equipment and medium
WO2022222835A1 (en) Video processing method, video processing apparatus and electronic device
CN112911149B (en) Image output method, image output device, electronic equipment and readable storage medium
CN114222069A (en) Shooting method, shooting device and electronic equipment
CN114025100A (en) Shooting method, shooting device, electronic equipment and readable storage medium
CN114143455B (en) Shooting method and device and electronic equipment
CN114390205B (en) Shooting method and device and electronic equipment
CN117294932A (en) Shooting method, shooting device and electronic equipment
CN115174812B (en) Video generation method, video generation device and electronic equipment
CN117395462A (en) Method and device for generating media content, electronic equipment and readable storage medium
CN113891005B (en) Shooting method and device and electronic equipment
CN114339051B (en) Shooting method, shooting device, electronic equipment and readable storage medium
CN114650370A (en) Image shooting method and device, electronic equipment and readable storage medium
CN112672059B (en) Shooting method and shooting device
CN111654620B (en) Shooting method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant