
CN120186431A - Video editing method, device, storage medium and program product


Info

Publication number
CN120186431A
Authority
CN
China
Prior art keywords
video
template
target
materialization
templates
Prior art date
Legal status
Pending
Application number
CN202510443784.3A
Other languages
Chinese (zh)
Inventor
雷鸣生
Current Assignee
Beijing 58 Information Technology Co Ltd
Original Assignee
Beijing 58 Information Technology Co Ltd
Application filed by Beijing 58 Information Technology Co Ltd filed Critical Beijing 58 Information Technology Co Ltd
Priority to CN202510443784.3A
Publication of CN120186431A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/85 Assembly of content; Generation of multimedia applications
    • H04N 21/854 Content authoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23412 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44012 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs, involving rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract


The embodiment of the present application provides a video editing method, device, storage medium, and program product. A first video template, which describes the rendering rules and hierarchical relationship of a group of video materials, is downloaded to a local terminal device and imported into video editing software. The first video template is opened in the editing area of the video editing software, and a first video generated based on the first video template is displayed in the preview area. Then, in response to a modification operation on the first video template, the modified rendering rule in any target materialization template and/or the modified positional relationship between any two target materialization templates is determined, and the first video is edited according to the modified rendering rule and/or positional relationship to obtain an edited second video. This improves the efficiency and diversity of video production, reduces dependence on professional equipment and manual intervention, and meets diverse video needs at high efficiency and low cost.

Description

Video editing method, apparatus, storage medium, and program product
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video editing method, apparatus, storage medium, and program product.
Background
With the rapid development of digital media technology and the diversification of content consumption demands, video production plays a vital role in numerous fields such as advertisement production, movie creation, online education, and short-video platforms. For example, in advertising, attractive video advertisements may be created based on product characteristics and target audience, thereby increasing brand awareness and product sales.
In the prior art, meeting flexible and diversified video requirements involves multiple complex stages such as shooting, post-editing, and special-effect addition. Each stage requires professional equipment and personnel and consumes a great deal of time and effort, so video production cost is high, production efficiency is low, and this production mode cannot keep pace with steadily growing video demand.
Disclosure of Invention
The embodiments of the application provide a video editing method, device, storage medium, and program product, which improve the efficiency and diversity of video production, reduce dependence on professional equipment and manual intervention, and meet diversified video requirements at high efficiency and low cost.
An embodiment of the application provides a video editing method comprising: downloading a first video template to a local terminal device, where the first video template is a video template for rendering a group of video materials to generate a first video and comprises the target materialization templates, among a plurality of materialization templates, that match the material types in the group of video materials; the plurality of materialization templates are obtained by variational processing of an initial video template, each materialization template describes the rendering rule of one video material, the first video template describes the rendering rules and hierarchical relationship of the group of video materials, and the hierarchical relationship is embodied as the positional relationships of the target materialization templates within the first video template; running video editing software on the terminal device and importing the first video template into the video editing software; opening the first video template in an editing area of the video editing software and displaying the first video generated based on the first video template in a preview area of the video editing software; and, in response to a modification operation on the first video template, determining the modified rendering rule in any target materialization template and/or the modified positional relationship between any two target materialization templates, and editing the first video according to the modified rendering rule and/or the modified positional relationship to obtain an edited second video.
An embodiment of the application also provides an electronic device comprising a processor and a memory storing a computer program; when the computer program is executed by the processor, the processor implements the steps of the video editing method provided by the embodiments of the application.
The embodiment of the present application also provides a computer-readable storage medium storing a computer program, which when executed by a processor, causes the processor to implement the steps in the video editing method provided by the embodiment of the present application.
Embodiments of the present application also provide a computer program product comprising computer programs/instructions which, when executed by a processor, cause the processor to implement the steps of the video editing method provided by the embodiments of the present application.
In the embodiment of the application, a first video template describing the rendering rules and hierarchical relationship of a group of video materials is downloaded to a local terminal device and imported into video editing software running on the terminal device. The first video template is opened in the editing area of the video editing software, and a first video generated based on the first video template is displayed in the preview area. Then, in response to a modification operation on the first video template, the modified rendering rule in any target materialization template and/or the modified positional relationship between any two target materialization templates is determined, and the first video is edited according to the modified rendering rule and/or positional relationship to obtain an edited second video. Performing secondary editing of the first video locally, based on an existing first video template and local video editing software, improves video production efficiency, reduces dependence on professional equipment and manual intervention, and lowers cost. Moreover, because the first video template is obtained by personalized combination of the video materials used to generate the video, and the combined template corresponds to the first video, the variety of video templates is enriched to a certain extent, further meeting the demand for diverse video content.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
Fig. 1 is a flowchart of a video editing method according to an exemplary embodiment of the present application;
Fig. 2 is an interaction schematic diagram of a video batch generation method according to an exemplary embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that, where the embodiments of the present application involve user information (including but not limited to user equipment information and user personal information) and data (including but not limited to data for analysis, stored data, and presented data), such information and data are authorized by the user or fully authorized by all parties; the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant country and region; and corresponding operation entries are provided for the user to authorize or refuse. In addition, the various models involved in the present application (including but not limited to language models or large models) comply with the relevant legal and standard regulations.
In addition, it should be noted that, where the embodiments of the present application involve user interaction or trigger operations, such operations include but are not limited to touch operations, gesture operations, voice operations, head-movement operations, and eye-movement operations. Touch operations include but are not limited to click, double-click, long-press, slide, pinch, and mouse-hover operations; slide operations include but are not limited to linear and curved slides.
Furthermore, where the embodiments of the present application involve a jump between a first interface and a second interface, the jump mode includes but is not limited to jumping directly from the first interface to the second interface, or jumping first to a task interface and then to the second interface once the corresponding task is completed on the task interface. Completing the task includes, for example, completing a game operation when the task interface is a game interface, completing identity authentication when it is an identity-authentication interface, or completing a recharge operation when it is a recharge interface.
In the prior art, meeting flexible and diversified video requirements involves multiple complex stages such as shooting, post-editing, and special-effect addition. Each stage requires professional equipment and personnel and consumes a great deal of time and effort, so video production cost is high, production efficiency is low, and this production mode cannot keep pace with steadily growing video demand.
To solve the problems of the prior art, in the embodiment of the application, a first video template describing the rendering rules and hierarchical relationship of a group of video materials is downloaded to a local terminal device and imported into video editing software running on the terminal device. The first video template is opened in the editing area of the video editing software, and a first video generated based on the first video template is displayed in the preview area. Then, in response to a modification operation on the first video template, the modified rendering rule in any target materialization template and/or the modified positional relationship between any two target materialization templates is determined, and the first video is edited accordingly to obtain an edited second video. Performing secondary editing of the first video locally, based on the existing first video template and local video editing software, improves the efficiency and diversity of video production, reduces dependence on professional equipment and manual intervention, and lowers cost. Moreover, because the first video template is obtained by personalized combination of the video materials used to generate the video, the combined template corresponds to the first video, enriching the variety of video templates and further meeting the demand for diverse video content. The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a video editing method according to an exemplary embodiment of the present application. As shown in fig. 1, the method includes:
S11, downloading a first video template to a local terminal device, where the first video template is a video template for rendering a group of video materials to generate a first video and comprises the target materialization templates, among a plurality of materialization templates, that match the material types in the group of video materials, the plurality of materialization templates being obtained by variational processing of an initial video template;
S12, running video editing software on the terminal device, importing the first video template into the video editing software, opening the first video template in an editing area of the video editing software, and displaying the first video generated based on the first video template in a preview area of the video editing software;
S13, in response to a modification operation on the first video template, determining the modified rendering rule in any target materialization template and/or the modified positional relationship between any two target materialization templates, and editing the first video according to the modified rendering rule and/or the modified positional relationship to obtain an edited second video.
In the embodiment of the application, the specific form or deployment mode of the local terminal device is not limited; any device with video processing and editing capabilities that can implement the video editing method can serve as the local terminal device. For example, the local terminal device may be a smartphone or tablet running a mobile operating system (such as an iOS/Android device) operated through touch interaction, a desktop computer or workstation equipped with a professional GPU (Graphics Processing Unit) supporting 4K/8K high-resolution video processing, or an embedded device (such as a smart television, a vehicle-mounted entertainment system, or an unmanned aerial vehicle control terminal).
In the embodiment of the application, the storage location of the first video template is not limited; any cloud or edge computing node with storage resources can store the first video template. For example, the first video template may be stored in and downloaded from a CDN (Content Delivery Network), which reduces transmission delay; it may be stored on and downloaded from a server; or it may be stored on an edge computing node, combined with network slicing to guarantee transmission quality.
In the embodiment of the application, the first video template is a video template used for rendering a group of video materials to generate a first video. The first video is obtained by rendering the group of video materials according to the rendering rules provided by the first video template. A group of video materials contains multiple material types. A material type is the type of a video element (i.e., a video material) that makes up the video content, including but not limited to audio, video, background graphics, subtitles, and digital persons. For example, a set of video materials may include a background video, a piece of audio spoken by a digital person, a subtitle layer for display, and a background image.
In the embodiment of the application, the first video template comprises the target materialization templates, among a plurality of materialization templates, that match the material types in a group of video materials. The target materialization templates are the materialization templates, corresponding to the respective video materials, that are combined according to a combination logic and adapted to the material types of the group of video materials. The combination logic comprises a hierarchical relationship, i.e., the layer stacking order of the video materials during video generation, which determines the front-to-back coverage and occlusion logic of the video materials. For example, when generating a video, the subtitles may sit on a relatively upper layer, the background picture on a relatively bottom layer, and the digital person may be superimposed above the background picture and below the subtitle layer.
In addition, the first video template further comprises an access link of at least one video material. An access link is a network address or storage path used to locate and acquire a video material, and is used to fetch the video material from the CDN network during rendering. For example, when the material type is digital person, the access link may point to a digital person model stored in the CDN network, such as https://model-cdn.example.com/avatar_001.glb.
In the embodiment of the application, the plurality of materialization templates are templates obtained by variational processing of the initial video template. The initial video template is an existing video production framework, which may be preset or derived from prior editing in editing software. In either case, the initial video template includes the rendering rules for the plurality of video materials required to generate a video and the hierarchical relationship between those video materials. Note that the initial video template does not include the video materials themselves; it includes rendering rules corresponding to the material types and a hierarchical relationship describing the display order and occlusion logic among different video materials.
Alternatively, the initial video template may be implemented as an MLT (Media Lovin' Toolkit) template. MLT is an open-source multimedia processing framework whose project files are XML-based and record all parameters of a video edit, such as the video clips on the timeline, audio tracks, filter effects, and transitions, and can therefore be used for video editing.
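As an illustrative sketch only (the application gives no template code), the following Python builds a minimal MLT-style XML project with the standard library. The element names (mlt, producer, playlist, tractor) follow the public MLT XML format; the file names, ids, and frame ranges are hypothetical.
```python
import xml.etree.ElementTree as ET

root = ET.Element("mlt")

# One producer per video material; "resource" points at the material file.
bg = ET.SubElement(root, "producer", id="background")
ET.SubElement(bg, "property", name="resource").text = "background.png"

# A playlist places the producer on one track of the timeline.
track = ET.SubElement(root, "playlist", id="video_track")
ET.SubElement(track, "entry", attrib={"producer": "background", "in": "0", "out": "249"})

# A tractor stacks tracks; track order encodes the layer (hierarchical) relationship.
tractor = ET.SubElement(root, "tractor")
multitrack = ET.SubElement(tractor, "multitrack")
ET.SubElement(multitrack, "track", producer="video_track")

print(ET.tostring(root, encoding="unicode"))
```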
In the embodiment of the application, part of the variational processing consists of structurally splitting the initial video template along the material-type dimension and templating its parameter information to obtain materialization templates. Each materialization template corresponds to a material type. For example, if the material types include audio, video, background images, subtitles, and digital persons, the corresponding materialization templates include, but are not limited to, audio, video, background-image, subtitle, and digital-person materialization templates.
In the embodiment of the application, each materialization template describes the rendering rule of one video material; the first video template describes the rendering rules and hierarchical relationship of a group of video materials, with the hierarchical relationship embodied as the positional relationships of the target materialization templates within the first video template. A rendering rule controls the presentation form, or rendering result, of a video material in the generated video, and can define that presentation form through a combination of rendering parameter information. Rendering parameter information consists of the specific parameters involved in rendering a video material: it describes at least one attribute of a material file of a given material type, each attribute serving as a rendering parameter whose default value is the attribute's value in the initial video template; the rendering parameters with their default values together form the rendering parameter information for material files of that material type. "Material files" are the resource files used in a materialization template, mainly the various video materials such as pictures, videos, and audio. The rendering parameter information controls the rendering logic of the material files so that they can be rendered into the finally generated video: it quantitatively describes the attributes of the material files, converting video material attributes into rendering parameters, and thereby controls the rendering effect of the video material in the final video. In other words, each attribute of a video material is described as a corresponding rendering parameter, and a rendering parameter with a default value defines the value of one attribute of the material file.
The rendering parameter information may be extracted from the information segments corresponding to the respective material types. The rendering parameter information for each material type includes at least one rendering parameter and its default value, each rendering parameter describing one attribute of a material file of that type. The embodiment of the application does not limit the specific attributes to which the rendering parameters of each material type correspond.
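A hedged illustration of how rendering parameter information might be organized; the concrete attribute names and default values below are hypothetical examples, not specified by the application.
```python
from dataclasses import dataclass, field

@dataclass
class RenderingParameter:
    name: str     # an attribute of the material file, e.g. "resolution"
    default: str  # default value taken from the initial video template

@dataclass
class MaterializationTemplate:
    material_type: str  # e.g. "subtitle", "video", "background"
    parameters: list[RenderingParameter] = field(default_factory=list)

# One materialization template per material type, each carrying
# rendering parameters with default values (hypothetical).
subtitle_tpl = MaterializationTemplate("subtitle", [
    RenderingParameter("font_size", "24px"),
    RenderingParameter("color", "white"),
])
video_tpl = MaterializationTemplate("video", [
    RenderingParameter("resolution", "1080p"),
    RenderingParameter("brightness", "50"),
])
```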
The hierarchical relationship is embodied as the positional relationships of the target materialization templates within the first video template; specifically, it is the layer stacking order among the different materialization templates during video generation. For example, in a video, the subtitle materialization template may be set at the upper layer, the background-picture materialization template at the lower layer, and the digital-person materialization template at the middle layer. This positional relationship determines the display order and occlusion logic of the video materials in the final video: upper-layer material covers lower-layer material, while middle-layer material may partially occlude the lower layer and be occluded by the upper layer. In this way, the hierarchical relationship ensures a rational presentation of the video content, so that the individual elements in the video can be superimposed and interact in the intended manner.
In the embodiment of the application, video editing software is an application that runs on the terminal device and provides an editing area and a preview area. The editing area is an interface in which video materials or video templates can be imported, edited, adjusted, and combined; the first video template can be imported into the video editing software, opened in the editing area, and modified there, for example by adjusting the order of video materials, modifying rendering rules, or changing hierarchical relationships. The preview area is a visual interface that displays the current editing result in real time; the first video generated based on the first video template can be displayed there. For example, after the user modifies parameters in the editing area, the preview area can immediately show the effect of the second video generated from the updated template, so the user can verify the editing result at once. The embodiment of the application does not limit the specific form of the video editing software: it may be free open-source software such as Shotcut, which supports multi-platform basic editing, or a custom-developed professional editing tool such as an enterprise-level video production platform or industry-specific software.
Below, taking Shotcut as the video editing software, a specific implementation of editing the first video is described. Depending on requirements, editing the first video template may involve a modified rendering rule in a target materialization template and/or a modified positional relationship between any two target materialization templates, as detailed in the following embodiments.
In alternative embodiment 1, editing the first video template may consist of customized adjustment of the rendering parameter information, in the target materialization template, that relates to the modified video material. For example, custom adjustment of a video material's rendering parameter information may adjust the material's attribute values.
In an alternative embodiment, the custom adjustment may change the attribute values of specific attributes of the first video template that relate to the modified material type, in order to optimize picture quality or performance, or to adapt to specific requirements. Shotcut may provide multiple candidate parameter values for the attributes of the video materials, and one of the candidate values may replace the original attribute value of the material type's specific attribute. For example, for a material of type video, the resolution attribute may be adjusted from "1080p" to "720p", or the brightness attribute replaced from "50" with "60".
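A minimal sketch of this candidate-value replacement, assuming a hypothetical table of candidate values per attribute; the function and attribute names are illustrative only.
```python
# Hypothetical candidate values per attribute; a replacement is only
# accepted if the new value is one of the candidates.
CANDIDATES = {
    "resolution": ["480p", "720p", "1080p", "4K"],
    "brightness": [str(v) for v in range(0, 101, 10)],
}

def adjust_attribute(params: dict, attr: str, new_value: str) -> None:
    if new_value not in CANDIDATES.get(attr, []):
        raise ValueError(f"{new_value!r} is not a candidate value for {attr!r}")
    params[attr] = new_value

video_params = {"resolution": "1080p", "brightness": "50"}
adjust_attribute(video_params, "resolution", "720p")  # 1080p -> 720p
adjust_attribute(video_params, "brightness", "60")    # 50 -> 60
```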
In alternative embodiment 2, editing the first video template may involve a modified positional relationship between any two target materialization templates. Modifying the positional relationship may be implemented either by adjusting the hierarchical relationship of two video materials or by changing their play order.
In an alternative embodiment, the first video template may be edited by adjusting the hierarchical relationship of any two video materials. Adjusting the hierarchical relationship means controlling the coverage relationship and presentation priority of different materials by changing their track positions or stacking order in the multi-track timeline. The embodiment of the application does not limit how the hierarchy is adjusted. For example, in Shotcut the hierarchy can be adjusted at the track level: each video track on the timeline (an organization and presentation tool that strings different video materials together in temporal order) forms an independent layer, tracks can be rearranged by mouse dragging, and the top track's content always covers the layers below, so moving a subtitle track above the video track ensures the subtitles are always visible. The hierarchy can also be adjusted through masking and compositing: on the timeline, later-added video material covers the area of earlier material by default, and a material can be given a "raise layer" or "lower layer" action, or its priority can be adjusted by dragging it above/below the target track. Alternatively, the hierarchy can be adjusted through keyframe-animation layer control: for dynamic effects containing keyframes (such as scaling and rotation), adjusting the track position ensures the animation composites correctly with the other video materials.
In another alternative embodiment, the first video template may be edited by adjusting the play order of any two video materials: changing the arrangement of materials within the same track to reorganize the narrative logic or timeline structure of the video. For example, a dialogue segment may be moved before the background audio (i.e., two video segments are swapped), or the play order of several video materials may be adjusted.
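A sketch of reordering clips within one track, under the assumption that a track can be modeled as an ordered list of material identifiers; the names are hypothetical.
```python
def move_before(track: list[str], clip: str, anchor: str) -> None:
    """Move `clip` so it plays immediately before `anchor` in the same track."""
    track.remove(clip)
    track.insert(track.index(anchor), clip)

track = ["intro", "background_audio", "dialogue", "outro"]
move_before(track, "dialogue", "background_audio")
print(track)  # ['intro', 'dialogue', 'background_audio', 'outro']
```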
In the embodiment of the application, the first video is edited based on the modified rendering rule and/or the modified positional relationship of the first video template to obtain the edited second video. That is, a video with different visual effects or structure, the second video, is regenerated from the rendering rules and/or positional relationships the user has modified. For example, if in the original first video the subtitles are on the bottom layer and occluded by the background image so that they are unclear, moving the subtitles to the upper layer by modifying the positional relationship makes them clearly visible, thereby generating the second video.
In an alternative embodiment, one implementation of determining the modified rendering rule in any target materialization template in response to a modification operation on the first video template is: in response to a positioning operation on the first video template, display the target materialization template to be modified in the editing area, and obtain the modified rendering rule in response to a code-editing operation in that template. When the user selects a material type to modify through interface interaction, the system loads the corresponding target materialization template into the editing area and visualizes its current rendering rule, which can then be seen in the editing area. For example, if the material type is subtitle, the target materialization template may include the subtitle's font size, color, and transparency. The rendering rule can be adjusted by directly modifying the attribute values in the code or in the rendering parameter information. For example, if the target materialization template is a background image, the user can change "filter: none" to "filter: blur(2px)" directly in the first video template's code, turning the background image from sharp to slightly blurred to highlight foreground content (such as a digital person or subtitles); or the user can change the background image's transparency value from "0.5" to "0.8" in the editing area, making the background image clearer to emphasize its content.
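A sketch of the code-editing path, assuming (as in the example above) that the rendering rule is stored as a CSS-like property string inside the template; the rule string and helper are hypothetical.
```python
import re

# Hypothetical CSS-like rendering rule for the background-image template.
template_code = "background { filter: none; opacity: 0.5; }"

def set_rule(code: str, prop: str, value: str) -> str:
    """Rewrite one rendering property in a CSS-like rule string."""
    return re.sub(rf"{prop}:\s*[^;]+;", f"{prop}: {value};", code)

template_code = set_rule(template_code, "filter", "blur(2px)")  # sharp -> slight blur
template_code = set_rule(template_code, "opacity", "0.8")       # 0.5 -> 0.8
print(template_code)  # background { filter: blur(2px); opacity: 0.8; }
```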
Optionally, another implementation of determining the modified rendering rule in any target materialization template in response to a modification operation on the first video template is: in response to a trigger operation for modifying a target materialization template in the first video template, display a first editing window; in response to a first editing operation in the first editing window, determine the name of the target materialization template and the modified rendering rule; and in response to a first edit-submission operation, refresh the first video template according to the template name and the modified rendering rule. The trigger operation is how the user initiates an editing action, for example double-clicking the background image or right-clicking on the timeline and selecting "edit parameter". The first editing window is an interface in the video editing software for modifying the target materialization template; it may contain input boxes, sliders, or drop-down menus for the attributes and attribute values in the rendering parameter information. For example, the editing window of a background-image materialization template may display a transparency slider. The first editing operation is the specific modification the user performs in the window, from which the name of the modified template and the modified rendering rule are determined; for example, dragging the slider to change transparency from 0.5 to 0.8 determines the template name as the background-image template and the modified rendering rule as opacity: 0.8. The first edit-submission operation confirms the modification after editing, for example clicking a "save" or "submit" button. Refreshing is the process of updating the rendering parameter information of the first video template according to the template name and the modified rendering rule, and updating the previewed video effect in real time after the first video is edited.
For example, suppose a user is editing a first video template containing a background picture, subtitles, and a digital person, and wishes to make the subtitles more noticeable. The user selects the subtitle materialization template in the video editing software; the system detects the trigger operation and displays a first editing window containing the template name (e.g., "subtitle template 1") and the current rendering rule (e.g., font color white, font size 24px). The user modifies the rendering rule in the window, changing the font color to yellow, the font size to 32px, and adding a shadow effect. The system determines the template name as "subtitle template 1" and records the modified rendering rule (font color, size, shadow). The user clicks the save button to submit; the system refreshes the first video template according to the modified rule and updates the subtitle display to obtain the second video. Through this workflow, targeting a single materialization template with immediate feedback after submission, the user intuitively adjusts rendering rules through a graphical interface without manually modifying the underlying code, which improves the efficiency and diversity of video production, reduces dependence on professional equipment and manual intervention, and meets diversified video requirements at high efficiency and low cost.
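A sketch of the submit-and-refresh step described above, modeling the first video template as a dict of materialization templates keyed by name; all names and rule keys are hypothetical.
```python
first_video_template = {
    "subtitle template 1": {"font_color": "white", "font_size": "24px"},
    "background template": {"opacity": "0.5"},
}

def refresh(template: dict, name: str, modified_rules: dict) -> None:
    """Apply the modified rendering rules to one target materialization
    template; the preview would then be re-rendered from the template."""
    template[name].update(modified_rules)

# First edit submission: yellow 32px subtitles with a shadow effect.
refresh(first_video_template,
        "subtitle template 1",
        {"font_color": "yellow", "font_size": "32px", "shadow": "on"})
```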
In an alternative embodiment, one implementation of determining the modified positional relationship between any two target materialization templates in response to a modification operation on the first video template is: in response to a drag operation on one target materialization template, move that template along the drag trajectory; determine the other target materialization template from the position where the drag ends; then fill the first template into the structure bit of the other template and the other template into the structure bit of the first, yielding the modified positional relationship between the two. A drag operation is the user dragging a target materialization template in the editing area, for example with the mouse or a touch gesture, to adjust its position. The system moves the template's visual position in real time along the drag trajectory, and when the drag ends it determines the other selected template from the release position. For example, the user drags the background-image template in the editing area, and when the drag ends the other template, such as the subtitle template, is determined from the final release position.
A structure bit describes the position of a target materialization template within the first video template. Structure bits are placeholders preset in the template that identify where a target materialization template can be inserted; each structure bit is filled by the target materialization template of one material type. The positional relationships among the structure bits reflect the hierarchical relationships among the video materials. A structure bit's information may include its number or name, which determines the occlusion logic of the template that fills it. For example, the structure bit layer_1 (background layer) is at the bottom, layer_2 (middle layer) is overlaid on it, and layer_3 (foreground layer) is on top.
After the two target materialization templates are determined, filling each into the other's structure bit exchanges their structural positions and yields the modified positional relationship between them. This positional relationship is the hierarchical relationship or track order of the two templates in the video, and it determines their occlusion logic. In the end, the layer or track positions of the two templates are inverted, changing their occlusion logic in the finally generated second video; for example, the occlusion logic may manifest as one target materialization template being overlaid on top of the other.
For example, suppose a user is editing a first video template containing a background picture, subtitles, and a digital person, and wishes to place the background picture over the subtitles. The user drags the selected "background picture template"; when it is dragged onto the position of the "subtitle template" and the mouse is released, the system determines the other target materialization template to be the subtitle template. The structure bit of the background picture template is layer 1 and that of the subtitle template is layer 2; exchanging them yields the modified positional relationship: the background picture template's new structure bit is layer 2 and the subtitle template's is layer 1. In the generated second video, the background picture is overlaid on the subtitles, occluding them. This process lets the user exchange the structure bits of target materialization templates through an intuitive visual drag, thereby modifying their positional relationship. The interaction simplifies otherwise complex layer adjustment, so the user can quickly invert the layer order without manually editing code, improving video production efficiency, reducing dependence on professional equipment and manual intervention, and meeting video requirements at high efficiency and low cost.
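A sketch of the structure-bit exchange, assuming structure bits can be modeled as layer names keyed by template name; the mapping below is hypothetical.
```python
structure_bits = {
    "background picture template": "layer_1",
    "subtitle template": "layer_2",
    "digital human template": "layer_3",
}

def swap_structure_bits(bits: dict, a: str, b: str) -> None:
    """Exchange the structure bits of two target materialization templates,
    inverting their stacking (occlusion) order in the generated video."""
    bits[a], bits[b] = bits[b], bits[a]

# Drag the background picture template onto the subtitle template's slot.
swap_structure_bits(structure_bits, "background picture template", "subtitle template")
print(structure_bits["background picture template"])  # layer_2
print(structure_bits["subtitle template"])            # layer_1
```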
Optionally, another implementation of determining the modified positional relationship between any two target materialization templates in response to a modification operation on the first video template is: in response to a trigger operation for adjusting the positional relationship of two target materialization templates in the first video template, display a second editing window; in response to a modification operation in the second editing window, determine the names of the two templates and their exchanged structure-bit information; and in response to a modification-submission operation, refresh the first video template according to the template names and exchanged structure bits to obtain the modified positional relationship. The second editing window displays the names of all target materialization templates and their current structure bits; it contains a drop-down menu or check boxes from which the user can select the exchangeable templates, as well as an input box for adding a new target materialization template.
The specific form of the modification operation in the second editing window, which determines the names of the two target materialization templates and their exchanged structure bits, is not limited. For example, in a visual mode, the user selects any two templates through the window's drop-down menu or check boxes, thereby determining their names and exchanged structure bits; in a manual mode, the user types the names of the two templates and their corresponding new structure bits into the window's input box. As for the modification-submission operation, refreshing the first video template according to the template names and exchanged structure-bit information is described in the foregoing embodiments and is not repeated here.
For example, suppose the user wishes to move the subtitle template from layer 2 to layer 1 so that it covers the background image. In the visual mode, the user selects the background picture template and the subtitle template in the editing interface and clicks the "swap position" button. The second editing window displays the names of all target materialization templates and their current structure bits as follows:
Target materialization template    Current structure bit
Background picture template        Layer 1
Subtitle template                  Layer 2
Digital human template             Layer 3
The user selects the "subtitle template" and "background picture template" and chooses the "swap position" function in the drop-down menu.
Optionally, the user may instead manually input: target materialization template A = subtitle template, new structure bit = layer 1; target materialization template B = background picture template, new structure bit = layer 2. The system determines that the modified target materialization templates are the subtitle template and the background picture template, with new structure bits subtitle → layer 1 and background picture → layer 2.
After the user clicks "submit", the hierarchical relationship in the first video template is updated to generate the second video: the subtitle template is now at layer 1 (the top layer), covering the background picture template (layer 2). Through the second editing window's interaction design, the process supports both visual selection and manual input, so the user can flexibly adjust the structure bits of two target materialization templates and thereby change their positional relationship. This design balances intuitiveness and flexibility, suits complex hierarchy-management scenarios, updates the first video template accurately, and ensures the modified hierarchy takes effect in real time.
In an alternative embodiment, downloading the first video template to the local terminal device comprises: in response to an access operation on a batch video generation result page, display that page, which shows, for each of N video instance identifiers (N being an integer greater than or equal to 2), the access link of its corresponding target video template on the CDN network or server side, the access link of its corresponding group of video materials on the CDN network or server side, and the access link of its corresponding video on the CDN network or server side; and, in response to a trigger operation on the access link of any target video template, download that target video template from the CDN network or server to the local terminal device as the first video template. The batch video generation result page is a page that centrally displays the batch-generated videos and their related information.
In the embodiment of the application, the N video instance identifiers are unique identifiers generated for the batch generation task and are mutually distinct, because each identifier represents one video to be generated and has a corresponding target video template. Therefore, for the N video instance identifiers, the batch video generation result page can show the access links of their corresponding target video templates on the CDN network or server side.
Each video to be generated has an associated group of video materials used to generate the video corresponding to its instance identifier. Each video instance identifier makes it convenient to track the generation process of its video; that is, the identifier also serves as the video's unique identity during generation, corresponding to the generation of one video. Therefore, for the N video instance identifiers, the batch video generation result page can display the access links of the corresponding groups of video materials and of the corresponding videos on the CDN network or server side.
The way the N video instance identifiers are generated is not limited here; they may be numbers, character strings, and so on. For example, N integers may be taken from an increment sequence with a fixed step size, starting from any integer, as the N identifiers, or N preset character strings may be used.
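A sketch of both identifier schemes this paragraph mentions (a fixed-step integer sequence, or preset strings); the starting value, step size, and prefix are hypothetical.
```python
def integer_ids(n: int, start: int = 1000, step: int = 10) -> list[str]:
    """N unique instance ids drawn from a fixed-step increment sequence."""
    return [str(start + i * step) for i in range(n)]

def string_ids(n: int, prefix: str = "video_instance") -> list[str]:
    """N unique instance ids as preset character strings."""
    return [f"{prefix}_{i:03d}" for i in range(1, n + 1)]

print(integer_ids(3))  # ['1000', '1010', '1020']
print(string_ids(3))   # ['video_instance_001', 'video_instance_002', 'video_instance_003']
```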
In an optional embodiment, for any video instance identifier whose corresponding video materials include a target subtitle, target audio, a green-screen video of a target digital person, and a target background image, the batch video generation result page shows the access links of the target subtitle, target audio, green-screen video, and target background image on the CDN network or server side, together with the access links of the corresponding video and of the corresponding target video template.
For example, the user clicks the "view results" button and enters the batch video generation result page. For each of 3 video instances, the page shows three columns of information: the access link of the corresponding target video template, the access links of the corresponding group of video materials, and the access link of the generated video.
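For illustration only, the following is a minimal sketch of the data that might back such a result page. The field names (instance_id, template_url, material_urls, video_url) and URLs are assumptions introduced here, not the application's actual schema.

```python
# Minimal sketch of the data behind a batch video generation result page.
# All field names and URLs are illustrative assumptions, not the actual schema.
from dataclasses import dataclass
from typing import List

@dataclass
class ResultRow:
    instance_id: str          # unique video instance identifier
    template_url: str         # access link of the target video template on the CDN/server
    material_urls: List[str]  # access links of the group of video materials
    video_url: str            # access link of the generated video

rows = [
    ResultRow(
        instance_id=f"vid-{i:03d}",
        template_url=f"https://cdn.example.com/templates/vid-{i:03d}.json",
        material_urls=[
            f"https://cdn.example.com/materials/vid-{i:03d}/subtitle.srt",
            f"https://cdn.example.com/materials/vid-{i:03d}/audio.mp3",
            f"https://cdn.example.com/materials/vid-{i:03d}/digital_person_green.mp4",
            f"https://cdn.example.com/materials/vid-{i:03d}/background.png",
        ],
        video_url=f"https://cdn.example.com/videos/vid-{i:03d}.mp4",
    )
    for i in range(1, 4)  # 3 video instances, as in the example above
]

for row in rows:
    print(row.instance_id, row.template_url, row.video_url)
```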
In an alternative embodiment, as indicated above, the batch video generation result page also includes access links to video materials. Based on this, in addition to performing local secondary editing on the first video, a new video may also be synthesized locally by downloading a plurality of video materials from the CDN network or server side to the local terminal device.
In an alternative embodiment, in response to a triggering operation on the access links of a plurality of video materials on the CDN network or server side, the plurality of video materials is downloaded from the CDN network or server side to the local terminal device, where the plurality of video materials may be distributed across one or more groups of video materials. Video editing software on the terminal device is then run, the plurality of video materials is imported into the video editing software and displayed in its editing area, and a third video is generated in response to editing operations on the plurality of video materials. A second video template corresponding to the third video is generated according to the attribute information and hierarchical relationships of the edited plurality of video materials; the second video template comprises the rendering rules of the plurality of video materials and their positional relationships within the second video template.
The user can download a plurality of video materials from the CDN network or server side to the local terminal device through a triggering operation. The downloaded video materials may be distributed across one or more groups, and the user can select video materials from the same group or from different groups to download.
For example, the first set of video materials includes:
background pictures: wedding-theme backgrounds (such as church, garden, and starry-sky scenes)
caption templates: preset wedding caption styles (such as a "Forever & Always" dynamic text effect)
The second set of video materials includes:
audio materials: wedding background music (such as piano pieces, symphonies, and popular song clips)
caption templates: romantic-style caption animations (such as a petal-falling caption effect)
The background pictures and caption templates in the first set of video materials may be downloaded from the CDN network as the plurality of video materials corresponding to the third video; alternatively, the background pictures in the first set and the audio materials and caption templates in the second set may be downloaded from the CDN network as the plurality of video materials corresponding to the third video. In the embodiment of the application, a user can freely select video materials from different groups for combination according to creation requirements, without being limited by the grouping of the video materials, which improves the flexibility of video generation. Rich video material types (video, audio, background images, subtitles, and the like) are provided on the CDN network or server side; the user can freely mix and match them to create unique video content, efficiently obtain the required video materials through the flexible downloading and combination functions, and perform personalized creation on the local terminal device, thereby meeting diversified video production requirements.
After the downloading is completed, the user runs the video editing software on the terminal device, imports the selected plurality of video materials, and displays and edits them in the editing area. After the user edits the materials, a new video can be generated; that is, the final video file generated from the plurality of edited video materials is the third video. Meanwhile, according to the attribute information and hierarchical relationships of the edited materials, the system generates a corresponding second video template. The second video template contains the rendering rules for the plurality of video materials and their positional relationships within the second video template, so that similar videos can subsequently be generated quickly based on the second video template.
In the embodiment of the present application, the specific implementation principle of generating the third video in response to editing operations on the plurality of video materials is the same as that of the foregoing embodiment in which the first video is edited into the second video in response to a modification operation on the first video template, and is not described again herein.
A second video template corresponding to the third video is generated according to the attribute information and hierarchical relationships of the edited plurality of video materials: the system automatically records the rendering rules and positional relationships of the edited third video to form a reusable second video template for subsequently generating similar videos quickly.
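As a minimal sketch of this step, the snippet below derives a reusable template from the attribute information and layer hierarchy of edited materials. The data structures, field names, and the JSON output format are assumptions for illustration, not the application's actual template format.

```python
# Illustrative sketch: deriving a reusable "second video template" from the
# attribute information and layer hierarchy of edited materials.
# The structures below are assumptions for illustration, not the actual format.
from dataclasses import dataclass, asdict
from typing import List
import json

@dataclass
class EditedMaterial:
    material_type: str   # e.g. "background", "subtitle", "animation"
    source: str          # file the user imported
    layer: int           # stacking order on the timeline (0 = bottom)
    opacity: float       # rendering attribute set during editing
    position: str        # e.g. "bottom", "top"

def build_second_template(materials: List[EditedMaterial]) -> dict:
    """Record rendering rules and positional relationships as a template."""
    layers = sorted(materials, key=lambda m: m.layer)
    return {
        "kind": "second_video_template",
        "layers": [asdict(m) for m in layers],  # hierarchy preserved bottom-up
    }

edited = [
    EditedMaterial("background", "background 1.mp4", 0, 1.0, "bottom"),
    EditedMaterial("animation", "product display animation.mp4", 1, 0.7, "middle"),
    EditedMaterial("subtitle", "product A subtitle.srt", 2, 1.0, "top"),
]
print(json.dumps(build_second_template(edited), indent=2))
```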
Scene embodiment: a user makes multiple product advertisement videos, where each video needs a different background and different subtitles but shares a unified animation effect.
First, the user downloads the following video material types from the CDN network:
Background group: background 1.mp4 (beach), background 2.mp4 (city)
Subtitle group: product A subtitle.srt, product B subtitle.srt
Animation group: product display animation.mp4 (the same animation template)
The user imports the three video materials into the video editing software, drags "background 1.mp4" to the bottom layer of the timeline, adds "product A subtitle" to the top layer, and sets the animation transparency to 70%. Finally, an advertisement video containing the beach background, product A subtitles, and the semi-transparent animation is generated. The system saves the edited settings (e.g., background transparency, subtitle position) to form a new template, i.e., the second video template. This process combines distributed material acquisition with local generation of video templates through editing, realizing efficient and flexible video production: the user can flexibly call different types of video materials from the CDN network and combine them in a personalized manner to obtain the second video template, improving the diversity of video generation; in addition, new videos can be rapidly generated through local editing and saved as video templates, improving the efficiency of video production, reducing dependence on professional equipment and manual intervention, and meeting high-efficiency, low-cost video requirements.
One implementation of batch video generation based on multiple materialized templates provided by embodiments of the present application is described in detail below.
In an alternative embodiment, a batch video generation task is generated in response to an input operation on a video generation page specifying the number of videos and the video category, the batch video generation task comprising the number of videos N and the video category. N video instance identifiers are generated according to the batch video generation task, and a group of video materials related to the video category is generated for each video instance identifier. A plurality of materialization templates obtained by performing variable processing on an initial video template is acquired, where the initial video template comprises the rendering rules of the plurality of video materials required to generate a video and the hierarchical relationships among those video materials. For each video instance identifier, at least one target materialization template is determined from the plurality of materialization templates according to the material types in the group of video materials corresponding to the video instance identifier, and the at least one target materialization template is combined based on the hierarchical relationships among the plurality of video materials included in the initial video template to obtain the target video template corresponding to the video instance identifier; the target video template describes the rendering rules and hierarchical relationships of the group of video materials. Finally, video generation processing is performed according to the target video templates and the groups of video materials corresponding to the N video instance identifiers.
In the present embodiment, the execution subject of the above batch video generation method is not limited. For example, the method may be implemented as a service product adopting a client-server architecture: on the one hand, the client provides the user with a video generation page for batch video generation, receives the user's input operations for the number of videos and the video category, and initiates a batch video generation task to the server; on the other hand, the server receives the batch video generation task initiated through the video generation page and generates videos in batches using server-side computing resources, network bandwidth, and storage resources, thereby improving the video generation speed.
For another example, as the processing capability of the hardware device corresponding to the client is enhanced, the method may be further executed by the client, where the client may provide a video generation page for the user, and respond to the input operation of the user on the number of videos and the video category on the video generation page to generate a batch video generation task, so as to perform batch video generation through the computing resource and the storage resource of the client. Under the condition that batch generation tasks are deployed on the client side to be executed, data transmission to the server side is not needed, and network delay can be saved.
In this embodiment, the batch video generation task includes the number of videos N, which is an integer greater than or equal to 2, and the video category. The video category describes the subject matter expressed by the video content to be generated in batch, and its specific implementation is not limited, including but not limited to product introduction, teaching instruction, knowledge science popularization, make-up and skin care, and the like. Alternatively, the video category may be implemented as a single-level video category, such as business registration or legal consultation. Alternatively, the video category may be implemented as a multi-level video category; for example, the primary video category may be business registration, and its secondary video categories may be cleaning or food handling, etc.
In this embodiment, N video instance identifiers are generated according to the batch generation task; the N video instance identifiers are distinct from each other, and each can uniquely represent one video to be generated, which facilitates tracking the required video materials and the video template of that video. That is, the video instance identifier can also serve as the unique identity of the related content (such as the video materials) of the video to be generated.
The manner of generating the N video instance identifiers is not limited herein and includes, but is not limited to, numbers, character strings, and the like. For example, N integers may be taken from an increasing sequence starting at any integer with a fixed step size and used as the N video instance identifiers, or N preset character strings may be used.
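The following sketch illustrates the two example schemes named above (a fixed-step integer sequence, and preset character strings); the starting value, step size, and prefix are arbitrary assumptions, not prescribed values.

```python
# Two illustrative ways to generate N mutually distinct video instance
# identifiers; both are sketches, not the application's actual scheme.

def ids_from_sequence(n: int, start: int = 1000, step: int = 7) -> list:
    """Take N integers from an increasing sequence with a fixed step size."""
    return [start + i * step for i in range(n)]

def ids_from_strings(n: int, prefix: str = "task42") -> list:
    """Use N preset character strings derived from the batch task."""
    return [f"{prefix}-{i:04d}" for i in range(n)]

print(ids_from_sequence(5))   # [1000, 1007, 1014, 1021, 1028]
print(ids_from_strings(5))    # ['task42-0000', ..., 'task42-0004']
```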
In this embodiment, a set of video materials associated with a video category is generated for each video instance identifier, the set of video materials being used to generate a video corresponding to the video instance identifier. Wherein each set of video material comprises video material of at least one material type. In some embodiments of the present application, a video material of one material type is simply referred to as a video material.
The material type refers to the type of media resources constituting video, including but not limited to audio, video, background images, subtitles, digital people and the like.
In the present embodiment, the implementation of generating a set of video materials related to a video category for each video instance identification is not limited.
In an alternative embodiment, for any video instance identifier, video material may be randomly extracted from multiple material types stored in the base material library, and the extracted video material may be used as a set of video materials for the video instance identifier. Wherein, the base material library stores a plurality of video materials under a plurality of video categories.
In another optional embodiment, for any video instance identifier, the semantic similarity between each of a plurality of video materials in the base material library and the video category is calculated to obtain a plurality of pieces of similarity information, and the video materials whose similarity information satisfies a similarity condition are used as the group of video materials for the video instance identifier.
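A minimal sketch of the two selection strategies just described follows. The library contents are invented, and `semantic_similarity` is a hypothetical stand-in for a real text-embedding comparison; everything here is illustrative.

```python
# Sketch: random extraction from a base material library, and
# similarity-threshold filtering against the video category.
import random

BASE_LIBRARY = {
    "audio": ["wedding_piano.mp3", "pop_clip.mp3"],
    "background": ["church.png", "garden.png", "stars.png"],
    "subtitle": ["forever_always.srt", "petal_fall.srt"],
}

def random_group(library: dict) -> dict:
    """Randomly extract one material per material type."""
    return {mtype: random.choice(items) for mtype, items in library.items()}

def semantic_similarity(material: str, category: str) -> float:
    # Hypothetical placeholder: a real system would embed both texts and
    # compare vectors; here we fake a score purely for illustration.
    return random.random()

def similarity_group(library: dict, category: str, threshold: float = 0.5) -> dict:
    """Keep, per material type, the material most similar to the category,
    provided its similarity satisfies the threshold condition."""
    group = {}
    for mtype, items in library.items():
        scored = [(semantic_similarity(m, category), m) for m in items]
        best_score, best = max(scored)
        if best_score >= threshold:
            group[mtype] = best
    return group

print(random_group(BASE_LIBRARY))
print(similarity_group(BASE_LIBRARY, "wedding"))
```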
Further, in the present embodiment, a plurality of materialized templates obtained by performing a variable process on an initial video template are acquired.
In the embodiment of the application, the initial video template can serve as a production framework for videos. The framework may be preset, or may be derived after video editing with editing software. In the embodiment of the present application, the specific manner of acquiring the initial video template is not limited: for example, the initial video template may be a preset general video template provided by the system by default, a custom video template created by a user according to requirements, or a video template imported from an external source.
The initial video template includes the rendering rules for the plurality of video materials required to generate the video, and the hierarchical relationships among the plurality of video materials. In an embodiment of the present application, the rendering rules for each video material describe the rendering logic followed for that video material during video rendering, so as to achieve the desired effect that the video material exhibits in the generated video. The hierarchical relationship among the plurality of video materials in the initial video template refers to the layer stacking order of the plurality of video materials during video generation, and determines their front-back coverage relationships and occlusion logic.
In this embodiment, the variable processing of the initial video template is implemented by taking the material type as the splitting variable: the initial video template is structurally split and its rendering parameter information is templated, yielding multiple materialization templates. The materialization templates correspond to material types. For example, where the material types include audio, video, background images, subtitles, digital persons, and the like, the corresponding materialization template types include, but are not limited to, audio materialization templates, video materialization templates, background-picture materialization templates, subtitle materialization templates, digital-person materialization templates, and the like. In other words, each material type corresponds to one type of materialization template, and each materialization template describes the rendering rules of video material of that type.
In the present embodiment, the timing of the variable processing is not limited. For example, the initial video template may be subjected to variable processing in advance; for another example, it may be subjected to variable processing dynamically. Details of how the variable processing is performed are given in the following embodiments.
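To make the splitting concrete, here is a sketch of variable processing under assumed data structures: the template is split per material type, and each fixed rendering parameter becomes a variable parameter with candidate values and a default. The template format is an assumption, not the patented format.

```python
# Illustrative sketch of "variable processing": structurally splitting an
# initial video template by material type and templating fixed rendering
# parameters into variable parameters with candidates and a default value.

initial_template = {
    "layers": [  # bottom-up hierarchy
        {"material_type": "background", "render": {"blur": 0}},
        {"material_type": "digital_person", "render": {"scale": 1.0}},
        {"material_type": "subtitle", "render": {"font_size": 32, "color": "white"}},
    ]
}

def variable_process(template: dict) -> dict:
    """Split into per-material-type materialization templates; turn each
    fixed rendering parameter into a variable with candidates + default."""
    materialized = {}
    for layer in template["layers"]:
        mtype = layer["material_type"]
        variables = {
            name: {"default": value, "candidates": [value]}  # candidates extendable
            for name, value in layer["render"].items()
        }
        materialized[mtype] = {"material_type": mtype, "variables": variables}
    return materialized

templates = variable_process(initial_template)
# e.g. extend the subtitle font-size candidates for later parameter adjustment
templates["subtitle"]["variables"]["font_size"]["candidates"] = [24, 32, 40]
print(templates["subtitle"])
```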
In this embodiment, the materialized templates subjected to the variational processing may be subjected to modular reorganization. For each video instance identifier, at least one target materialization template is determined from a plurality of materialization templates according to the material type in the group of video materials corresponding to the video instance identifier. The material types in the group of video materials have a corresponding relation with the materialization templates.
For example, in the case that the set of video materials includes the material types of audio, video, background image, and subtitle, the target materialization templates may include target materialization templates corresponding to audio, video, background image, and subtitle respectively; in the case that the set of video materials includes the material types of audio, video, background image, subtitle, and digital person, the target materialization templates may include target materialization templates corresponding to audio, video, background image, subtitle, and digital person respectively.
Further, under the condition that at least one target materialization template corresponding to a group of video materials is obtained, the at least one target materialization template is combined based on the hierarchical relationship among a plurality of video materials included in the initial video template, so that a target video template corresponding to the video instance identifier is obtained. The target video template is used for describing rendering rules and hierarchical relations of a group of video materials.
The hierarchical relationships among the plurality of video materials included in the initial video template represent the layer stacking order of the corresponding materialization templates and can be used to organize and integrate the target materialization templates, thereby forming a target video template with clear hierarchical relationships and rendering rules for the corresponding video materials. The hierarchical relationship of the group of video materials described by the target video template is consistent with the hierarchical relationship of that group of video materials in the initial video template.
In this embodiment, the N video instance identifiers correspond to respective target video templates; that is, each group of video materials has its own target video template, which enriches the variety of target video templates used for batch video generation. Video generation processing is performed on the groups of video materials corresponding to the N video instance identifiers based on the rendering rules described by the N target video templates, so that the style of each batch-generated video matches the video template it adopts, improving the diversity of the batch-generated video content.
When the target video templates corresponding to the video instance identifiers have been obtained, video generation processing is performed according to the target video templates and groups of video materials corresponding to the N video instance identifiers to obtain N videos under the video category. Video generation processing refers to filling and rendering the video materials in the target video template based on the target video template and the group of video materials corresponding to each video instance identifier, so as to obtain the video corresponding to that video instance identifier. For example, for the N video instance identifiers, the corresponding groups of video materials may be filled into the N target video templates to obtain filled target video templates, which are then rendered to obtain the videos under the video category.
In an alternative embodiment, batch video generation processing can be performed on the videos corresponding to the N video instance identifiers, with M videos processed per batch of parallel video generation processing, where M is an integer smaller than N. After M videos are generated, processing continues with the next M videos until the videos corresponding to all N video instance identifiers have been processed. Batch video generation processing improves the resource utilization of the server while avoiding server overload caused by high concurrency.
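The size-M batching just described can be sketched as follows; `generate_video` is a hypothetical stub standing in for the actual template-filling and rendering step.

```python
# Sketch of batched, size-M parallel generation over N instances (M < N).
from concurrent.futures import ThreadPoolExecutor

def generate_video(instance_id: str) -> str:
    return f"{instance_id}.mp4"  # stand-in for template filling + rendering

def batch_generate(instance_ids: list, m: int) -> list:
    """Process M videos per parallel batch until all N are done."""
    results = []
    for start in range(0, len(instance_ids), m):
        batch = instance_ids[start:start + m]
        with ThreadPoolExecutor(max_workers=m) as pool:
            results.extend(pool.map(generate_video, batch))
    return results

print(batch_generate([f"vid-{i}" for i in range(10)], m=3))
```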
In the embodiment of the application, during batch video generation, the materialization templates of multiple material types obtained by performing variable processing on the initial video template are recombined to obtain the video templates required for generating the videos. Specifically, a video instance identifier is bound to each video, and a corresponding group of video materials is bound to the video instance identifier; the target materialization templates used for recombination are determined from the plurality of materialization templates according to the group of video materials corresponding to the video instance identifier; and the target materialization templates are organized and integrated using the hierarchical relationships among the materialization templates provided by the initial template, obtaining the video template corresponding to each video instance identifier, whereupon video generation processing is performed on the group of video materials corresponding to each video instance identifier.
In an alternative embodiment, the target video templates, groups of video materials, and videos corresponding to the N video instance identifiers may be uploaded to the CDN network or server side for storage. In this way, these resources can be accessed and distributed quickly while the storage pressure on the local terminal device is reduced. The CDN network can efficiently distribute the target video templates, groups of video materials, and videos corresponding to the N video instance identifiers to nodes near the user, improving access speed and stability. The server side can centrally manage these resources and support large-scale batch video generation. By storing the target video templates, video materials, and videos on the CDN network or server side, users can obtain these resources through access links at any time for further editing or viewing, improving the efficiency and flexibility of video generation.
In the embodiments of the present application, a manner of the variable processing will be described in detail in the subsequent embodiments.
In the embodiment of the application, the initial video template is subjected to variable processing to obtain various materialized templates. When the batch videos are generated, at least one target materialization template is determined from a plurality of materialization templates according to the material type in a group of video materials corresponding to each video instance identifier.
In an alternative embodiment, determining at least one target materialization template from the plurality of materialization templates according to the material types in the group of video materials corresponding to the video instance identifier comprises the following steps: identifying at least one material type contained in the group of video materials corresponding to the video instance identifier; selecting at least one initial materialization template from the plurality of materialization templates according to the at least one material type contained in the group of video materials, where each material type corresponds to one initial materialization template; and performing parameter adjustment on at least part of the selected at least one initial materialization template to obtain the at least one target materialization template.
In this embodiment, a materialization template among the plurality of materialization templates obtained by variable processing of the initial video template is referred to as an initial materialization template. When at least one initial materialization template corresponding to a certain group has been determined from the plurality of materialization templates, the rendering rule described by the initial materialization template may be referred to as an initial rendering rule. Parameter adjustment is then performed on at least part of the at least one initial materialization template to obtain at least one target materialization template; each initial materialization template yields a corresponding target materialization template after parameter adjustment. The parameter-adjusted target materialization template comprises a target rendering rule that differs from the initial rendering rule described by the initial materialization template.
In this embodiment, since the initial materialization template is obtained through variable processing, the variable processing converts the fixed rendering parameters in the initial video template into variable parameters that can be dynamically assigned, enabling flexible configuration of the template content; if the values of the variable parameters differ, the rendering rules differ. Optionally, each initial materialization template includes at least one variable parameter, and each variable parameter is associated with a plurality of candidate parameter values and a default parameter value.
In this embodiment, parameter adjustment is performed on the initial materialization template in order to obtain target materialization templates with different candidate parameter values, by assigning different candidate parameter values to the variable parameters of the initial materialization template. Target materialization templates for a given material type can be used to render the grouped video materials of different video instance identifiers; if the parameter values of the variable parameters differ, the rendering rules described by target materialization templates of the same material type differ, so that the rendering results of the batch-generated videos differ, creating differentiation between videos and enriching the video content. How parameter adjustment is performed on the initial materialization template is described below.
In an alternative embodiment, performing parameter adjustment on at least part of the at least one initial materialization template to obtain at least one target materialization template comprises the following steps: determining the number of templates whose parameters are to be adjusted, where this number is smaller than or equal to the number of the at least one initial materialization template; selecting the initial materialization templates to be adjusted from the at least one initial materialization template according to the number of templates to be adjusted; determining the variable parameters to be adjusted from the initial materialization templates to be adjusted; randomly determining a target parameter value from the plurality of candidate parameter values associated with each variable parameter to be adjusted; and assigning the target parameter value to the variable parameter to be adjusted to obtain the target materialization template.
In the present embodiment, the manner of determining the number of templates whose parameters are to be adjusted is not limited. For example, parameters may be adjusted for every initial materialization template, in which case the number of templates to be adjusted equals the number of the at least one initial materialization template. For another example, a random integer in a preset range may be generated as the number of templates to be adjusted according to a random number generation algorithm, where the preset range is bounded above by the number of the at least one initial materialization template. The random number generation algorithm is not limited and includes, but is not limited to, the linear congruential generator (LCG), the Mersenne Twister algorithm, and the like.
Further, the initial materialization templates to be adjusted are selected from the at least one initial materialization template according to the number of templates to be adjusted. In an alternative embodiment, the selection is performed according to the priority of the material type and the number of templates to be adjusted. The priority of a material type reflects the importance of that material type to the display effect of the generated video. For example, if the subtitle material type has low importance for the display effect, it may be set to a lower priority; the background image may be set to a medium priority, higher than the subtitle; and the digital person, being highly important to the display effect, may be set to a higher priority. Templates whose parameters are to be adjusted can then be selected first from the initial materialization templates whose material types have higher priority; if their number is smaller than the previously determined number of templates to be adjusted, templates are additionally selected from the initial materialization templates whose material types have lower priority, the number of finally selected templates being smaller than or equal to the number of the at least one initial materialization template.
Further, the variable parameters to be adjusted are determined from the initial materialization templates to be adjusted. In an alternative embodiment, all the variable parameters in the initial materialization template to be adjusted may be used as the variable parameters to be adjusted. In another alternative embodiment, target variable parameters in the initial materialization template to be adjusted are used as the variable parameters to be adjusted, where the target variable parameters are preselected variable parameters.
Further, a target parameter value is randomly determined from the plurality of candidate parameter values associated with each variable parameter to be adjusted. In some embodiments, the variable parameters to be adjusted are the same across the at least one initial materialization template corresponding to each of the N video instance identifiers, and the candidate parameter values associated with a variable parameter to be adjusted may be randomly assigned as the respective target parameter values of the N video instance identifiers, so that the parameter values of the N video instance identifiers differ as much as possible.
Further, the target parameter value is assigned to the variable parameter to be adjusted to obtain the target materialization template.
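The whole parameter-adjustment flow can be sketched as below: pick how many templates to adjust, select them by material-type priority, then randomly assign candidate values to their variable parameters. The priority table and data structures are illustrative assumptions.

```python
# Sketch of parameter adjustment over initial materialization templates.
import random

PRIORITY = {"digital_person": 3, "background": 2, "subtitle": 1}  # higher = adjust first

def adjust(initial_templates: dict) -> dict:
    """initial_templates: material_type -> {"variables": {name: spec}}"""
    count = random.randint(1, len(initial_templates))  # number of templates to adjust
    to_adjust = sorted(initial_templates, key=lambda t: -PRIORITY.get(t, 0))[:count]
    targets = {}
    for mtype, tmpl in initial_templates.items():
        adjusted = {"material_type": mtype, "variables": {}}
        for name, spec in tmpl["variables"].items():
            value = spec["default"]
            if mtype in to_adjust:
                value = random.choice(spec["candidates"])  # random target value
            adjusted["variables"][name] = value
        targets[mtype] = adjusted
    return targets

initial = {
    "subtitle": {"variables": {"font_size": {"default": 32, "candidates": [24, 32, 40]}}},
    "background": {"variables": {"blur": {"default": 0, "candidates": [0, 2, 4]}}},
}
print(adjust(initial))
```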
When the target materialization templates have been obtained, they are combined to obtain the target video template. The manner of combination is not limited in this embodiment; two combination manners are provided below, but the combination is not limited thereto.
In an alternative embodiment, a base video template is generated according to a hierarchical relationship between a plurality of video materials included in the initial video template, and the base video template is used as a frame of a target video template and comprises a plurality of blank structural bits corresponding to the plurality of video materials. The blank structure bits refer to placeholders of target materialization templates preset in the basic video templates and are used for identifying positions where the target materialization templates can be inserted, and each blank structure bit corresponds to filling of the target materialization templates of one material type. The positional relationship between the plurality of blank structure bits reflects the hierarchical relationship between the plurality of video materials, which, as described in the above-described embodiment, represents the order of superimposition of the plurality of materials in generating the video. In some embodiments, the hierarchical relationship is extracted from the initial video template, or may be preset based on at least one target materialization template, that is, the hierarchical relationship may be set as desired. For example, the structure corresponding to the caption is located at the upper layer, the structure corresponding to the digital person is located at the middle layer, and the structure corresponding to the background picture is located at the bottom layer. Further, at least one target materialization template is respectively inserted into the corresponding blank structure position in the basic video template, so that a target video template corresponding to the video instance identifier is obtained.
In another alternative embodiment, according to the at least one target materialization template, the structure bits occupied by the rendering rules of video materials of the same material type in the initial video template are overwritten, so as to obtain the target video template corresponding to the video instance identifier. The difference between a structure bit and a blank structure bit is that the rendering rule of each video material in the initial video template occupies a structure bit, whereas a blank structure bit is empty. The positional relationships among the structure bits reflect the hierarchical relationships among the plurality of video materials.
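A minimal sketch of the first combination manner follows: a base template holds one blank structure bit per material type in hierarchy order, and each target materialization template is inserted into its corresponding slot. The hierarchy and structures are assumed for illustration.

```python
# Sketch: combining target materialization templates via blank structure bits.

HIERARCHY = ["background", "digital_person", "subtitle"]  # bottom -> top

def build_base_template(hierarchy: list) -> list:
    """One blank structure bit (placeholder) per material type."""
    return [{"material_type": m, "slot": None} for m in hierarchy]

def combine(target_templates: dict, hierarchy: list) -> list:
    base = build_base_template(hierarchy)
    for structure_bit in base:
        structure_bit["slot"] = target_templates.get(structure_bit["material_type"])
    return base  # the target video template: rendering rules + hierarchy

targets = {
    "background": {"variables": {"blur": 2}},
    "digital_person": {"variables": {"scale": 1.0}},
    "subtitle": {"variables": {"font_size": 40}},
}
for layer in combine(targets, HIERARCHY):
    print(layer["material_type"], "->", layer["slot"])
```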
When the target video template has been obtained, video generation processing is performed according to the target video template and group of video materials corresponding to each video instance identifier. In an alternative embodiment, performing video generation processing according to the target video templates and groups of video materials corresponding to the N video instance identifiers to obtain N videos under the video category comprises the following steps: filling the group of video materials corresponding to each video instance identifier into the target materialization templates within the target video template corresponding to that video instance identifier; and, for each filled target video template, rendering the group of video materials according to the rendering rules and hierarchical relationships of the group of video materials described by the filled target video template, so as to obtain one video under the video category.
The target materialization template comprises at least one placeholder corresponding to its material type, used for filling in video material of that material type.
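A sketch of the fill-and-render step follows. `render_layer` is a hypothetical stub standing in for a real compositing engine, and the template structure mirrors the combination sketch above; all of it is illustrative.

```python
# Sketch: fill each placeholder with a concrete asset, then "render" the
# layers bottom-up per the hierarchy (later layers cover earlier ones).

target_video_template = [
    {"material_type": "background", "slot": {"blur": 2}},
    {"material_type": "digital_person", "slot": {"scale": 1.0}},
    {"material_type": "subtitle", "slot": {"font_size": 40}},
]

def fill(template: list, materials: dict) -> list:
    """materials: material_type -> concrete asset (file path / URL)."""
    return [{**layer, "asset": materials[layer["material_type"]]} for layer in template]

def render_layer(layer: dict) -> str:
    return f"composited({layer['material_type']}:{layer['asset']})"  # stub

def render(filled_template: list) -> list:
    return [render_layer(layer) for layer in filled_template]

materials = {
    "background": "church.png",
    "digital_person": "green_screen.mp4",
    "subtitle": "wedding.srt",
}
print(render(fill(target_video_template, materials)))
```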
In an alternative embodiment, generating a group of video materials related to the video category for each video instance identifier comprises the following steps: acquiring, for each video instance identifier, a group of video material description information related to the video category, where each group of video material description information comprises the description information of a plurality of video materials; and, for each video instance identifier, invoking a plurality of artificial-intelligence-based material generation models according to the group of video material description information corresponding to the video instance identifier, generating the plurality of video materials corresponding to the video instance identifier, and synchronously uploading those video materials to a content distribution network. Correspondingly, before video generation processing is performed according to the target video templates and groups of video materials corresponding to the N video instance identifiers to obtain N videos under the video category, in response to a batch video generation trigger event, the pluralities of video materials corresponding to the N video instance identifiers are respectively acquired from the content distribution network according to the N video instance identifiers. Managing the video materials through the content distribution network reduces the storage pressure on the server side, enables the server side to efficiently render a large number of videos, and improves user experience.
In this embodiment, the description information of each group of video materials is used to generate video materials corresponding to the video instance identifier. Wherein each group of video material contains video material of a plurality of material types. The material types refer to the types of different video elements that make up the video content, including but not limited to audio, video, background graphics, subtitles, digital people, and the like.
In the present embodiment, the implementation of generating the description information of the group of video materials related to the video category for each video instance identification is not limited.
In an alternative embodiment, for any video instance identifier, description information of video materials may be randomly extracted from the multiple material types stored in the base material library, and the extracted description information of the multiple video materials is used as the group of video material description information for that video instance identifier. The base material library stores the description information of multiple video materials under multiple video categories.
In another optional embodiment, for any video instance identifier, semantic similarity between description information of multiple video materials in a base material library and video category is calculated to obtain multiple similarity information, and the description information of the video materials meeting the similarity condition in the multiple similarity information is used as description information of a group of video materials of the video instance identifier.
In yet another alternative embodiment, for any video instance identifier, a material description information generation model is invoked according to the video category; the model is used to generate the description information of a plurality of video materials related to the video category. The material description information generation model is trained on the description information of a large number of sample video materials of different sample video categories and different material types; by learning the semantic correlations between sample video categories and the description information of sample video materials of different material types, it can generate, in a targeted manner, the description information of various video materials related to a given video category.
Further, for each video instance identifier, according to a group of video material description information corresponding to the video instance identifier, invoking a plurality of material generation models based on artificial intelligence, generating a plurality of video materials corresponding to the video instance identifier, and synchronously uploading the plurality of video materials corresponding to the video instance identifier to a content distribution network.
Each material generation model can generate at least part of the plurality of video materials; for example, one video material may be generated by one material generation model. This is described below by way of example, but is not limited thereto.
In this embodiment, in the case of generating video materials of a video instance identifier, a target video template corresponding to each video instance identifier is determined according to a plurality of video materials corresponding to each video instance identifier, and each target video template corresponding to each video instance identifier is used to describe rendering rules and hierarchical relationships of the plurality of video materials corresponding to the video instance identifier. The determination manner of each video instance identifier corresponding to the target video template may refer to the above embodiment, and will not be described herein.
In the embodiment, in response to a batch video generation trigger event, according to N video instance identifications, multiple video materials corresponding to the N video instance identifications are respectively obtained from a content distribution network, and video generation is performed according to the multiple video materials corresponding to the N video instance identifications and a target video template, so as to obtain N videos under a video category.
In this embodiment, the specific implementation of the batch video generation trigger event is not limited and may be flexibly configured according to actual application requirements. For example, the trigger event may be that all the pluralities of materials corresponding to the N video instance identifiers have been generated; or, whenever the plurality of video materials corresponding to one video instance identifier has been generated, video generation is performed for that video instance identifier; or, the batch video generation trigger event may be a preset trigger time, for example, a period of time (such as 2 hours, 1 day, or 1 week) after the input operation on the video generation page for the number of videos and the video category, where the time span is not limited.
It should be noted that each video instance identifier corresponds to one video generation, so N video instance identifiers may correspond to N video generations. When video generation is performed on the pluralities of video materials corresponding to the N video instance identifiers, the video generation processes corresponding to the video instance identifiers are asynchronous and do not affect each other, improving the generation efficiency of the N videos.
Further optionally, the description information for each group of video materials includes, but is not limited to, subtitle description information, audio type description information, digital person description information, and background picture description information. As shown in FIG. 2, invoking a plurality of artificial-intelligence-based material generation models according to the group of video material description information corresponding to a video instance identifier to generate the plurality of video materials corresponding to that video instance identifier includes: invoking a generative language model to generate text information according to the subtitle description information, obtaining a target subtitle; invoking a text-to-speech model according to the target subtitle and the audio type description information to convert the target subtitle into target audio adapted to the audio type description information; invoking a multimodal model according to the target audio and the digital person description information, selecting a target digital person according to the digital person description information, and generating a green-screen video of the target digital person based on the target audio; and invoking a text-to-image model according to the background picture description information to generate a background image, obtaining the target background image.
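An end-to-end sketch of this material-generation chain is shown below. Every model call is a hypothetical stub; a real system would invoke actual generative-language, text-to-speech, multimodal, and text-to-image services and then upload each result to the content distribution network.

```python
# Sketch of the chain: subtitle -> audio -> digital-person green-screen
# video, plus an independent background image. All calls are stubs.

def generative_language_model(caption_desc: str) -> str:
    return f"target subtitle for: {caption_desc}"

def text_to_speech(subtitle: str, audio_desc: str) -> str:
    return f"audio({subtitle!r}, style={audio_desc})"

def multimodal_green_screen(audio: str, person_desc: str) -> str:
    return f"green_screen_video(person={person_desc}, driven_by={audio!r})"

def text_to_image(background_desc: str) -> str:
    return f"background_image({background_desc})"

def generate_materials(desc: dict) -> dict:
    subtitle = generative_language_model(desc["caption"])
    audio = text_to_speech(subtitle, desc["audio_type"])
    return {
        "subtitle": subtitle,
        "audio": audio,
        "green_screen": multimodal_green_screen(audio, desc["digital_person"]),
        "background": text_to_image(desc["background"]),
    }

print(generate_materials({
    "caption": "business registration how-to",
    "audio_type": "female, calm",
    "digital_person": "host-07",
    "background": "modern office, soft light",
}))
```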
In this embodiment, the APIs (Application Programming Interfaces) of the multiple material generation models are associated with endpoints. The video generation page is a presentation of the endpoint's front-end code. The endpoints are used for generating video materials, managing batch video generation tasks, and controlling the video generation flow.
The endpoint comprises front-end code and back-end code, i.e., it adopts the client-server architecture described in the above embodiments. The front-end code is a video generation page built on a front-end framework; it runs on the client and is used to interact with the user, receive the user's input operations, and initiate batch video generation tasks to the server. The back-end code of the endpoint executes on the server and is obtained by encapsulating the code of existing video editing software (such as "shortcut") behind an API. In existing video editing software, the front-end UI code is highly coupled with the video rendering code; in this embodiment, the code is split by function so as to decouple the front-end UI of the video editing software from the video rendering code, and the video rendering code is then encapsulated as an independent API, so that the video rendering function can be called by external devices in a standard API manner, realizing automated video generation.
As for the endpoint in FIG. 2, in one example, the front-end code of the endpoint may be a video generation page built on Astro and is responsible for receiving callback notifications; for example, a notification may be sent to the endpoint when a material generation model finishes processing, indicating that the subsequent video generation flow may continue. The back-end code of the endpoint can be obtained by wrapping the video rendering code of the video editing software in an API, which is then exposed as an API interface for external calls.
In an alternative embodiment, invoking a generative language model to generate text information according to the subtitle description information to obtain the target subtitle comprises the following steps: matching a pre-designed prompt word template according to the keywords of the video category; filling the keywords of the video category into the prompt word template to obtain the prompt words for the video category; and inputting the prompt words for the video category into the generative language model to obtain the target subtitle related to the video category.
The keywords describe the subject matter expressed by the video category; different video categories may correspond to different keywords, and the prompt word templates corresponding to different video categories may also differ. The target subtitle includes all text content required for each video. By invoking the generative language model, the target subtitle can be generated automatically based on the video category without manual intervention, improving generation efficiency.
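The prompt-construction step can be sketched as follows; the template texts, category names, and model stub are all assumptions introduced for illustration.

```python
# Sketch: match a prompt template by video category, fill in the keywords,
# and pass the prompt to a (stubbed) generative language model.

PROMPT_TEMPLATES = {
    "product introduction": "Write a short-video script introducing {keywords}, "
                            "with a hook, three selling points, and a call to action.",
    "knowledge science popularization": "Explain {keywords} in plain language "
                                        "for a 60-second science video.",
}

def build_prompt(category: str, keywords: str) -> str:
    template = PROMPT_TEMPLATES[category]          # match by video category
    return template.format(keywords=keywords)      # fill the keywords in

def generative_language_model(prompt: str) -> str:
    return f"[subtitle generated from prompt: {prompt[:40]}...]"  # stub

prompt = build_prompt("product introduction", "wireless earbuds")
print(generative_language_model(prompt))
```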
Further optionally, according to the target subtitle and the audio type description information, a text-to-speech model is invoked to convert the target subtitle into target audio adapted to the audio type description information. The target subtitle provides the text content, and the audio type description information specifies the audio type of the generated target audio, including but not limited to audio format, audio language, audio intonation, audio quality, and the like; any audio type that can specify the sound effect of the target audio is applicable to this embodiment. Based on the target subtitle and the audio description information, the text-to-speech model is invoked to produce the corresponding target audio, which is uploaded to the content distribution network, improving the degree of automation of audio generation.
Further, the multimodal model is invoked to generate a green-screen video of the digital person. A digital person (virtual human) is a virtual figure generated with AI technology that can synchronously simulate the appearance, voice, mouth shape, and other behaviors of a real person, and can be used in scenarios such as intelligent customer service, short video production, and virtual anchoring, but is not limited thereto. In this embodiment, the plurality of video materials corresponding to each video instance identifier is combined, so that a video with a realistic talking-person effect can be generated.
In an alternative embodiment, the digital person may be a real-person video or picture recorded in advance; in subsequent embodiments, such videos and pictures are collectively referred to as video frames, and the number of video frames may be one or more. In this case, the description information for different digital persons may be implemented as identification information, i.e., a unique identifier of the digital person used to acquire the video frames of the corresponding digital person. In another alternative embodiment, the video frames of the digital person may be generated dynamically based on the multimodal model, in which case the description information of the digital person may be a prompt for generating the digital person, and the description information corresponding to each video instance identifier may differ.
Further optionally, invoking the multimodal model according to the target audio and the digital person description information, selecting the target digital person according to the digital person description information, and generating the green-screen video based on the target audio and the target digital person to obtain the green-screen video of the target digital person comprises the following steps: acquiring the corresponding video frames of the target digital person based on the digital person description information; invoking the multimodal model according to the video frames of the target digital person and the target audio to perform multidimensional feature extraction on the target audio, obtaining multidimensional voice features, where the multidimensional voice features include but are not limited to voice content features and voice emotion features; determining the mouth shape control parameters of the target digital person according to the voice content features; determining the facial expression control parameters of the target digital person according to the voice emotion features; determining the limb action control parameters of the target digital person according to the voice content features and the voice emotion features; and generating the green-screen video of the target digital person based on the mouth shape, facial expression, and limb action control parameters, such that the mouth shape, facial expression, and limb actions in the green-screen video match the target audio.
In this embodiment, the voice content features are used to reflect semantic information in the target audio, such as vocabulary content, grammar structure, voice intention and voice rhythm, and are mainly used to drive the mouth shape of the digital person to be semantically synchronized with the target audio. The voice emotion characteristics are used for reflecting the emotion state of the target audio, including but not limited to the strength of the voice, the speed of the voice, the intonation change and the like, and are mainly used for driving the facial expression and limb actions of the digital person.
Further, according to the voice content characteristics, determining the mouth shape control parameters of the target digital person, according to the voice emotion characteristics, determining the facial expression control parameters of the target digital person, and according to the voice content characteristics and the voice emotion characteristics, determining the limb motion control parameters of the target digital person.
When the mouth shape control parameters, facial expression control parameters, and limb action control parameters have been obtained, the target digital person is driven to perform action rendering based on these parameters, generating the corresponding green-screen video of the target digital person, which also makes it convenient to flexibly replace the background image later. Because the green-screen video of the target digital person is generated under the control of the multidimensional voice features extracted from the target audio, the dynamic mouth shapes, facial expressions, and limb actions of the target digital person in the green-screen video match the target audio semantically, temporally, and emotionally; accurate mouth-shape alignment is ensured, and the realism and viewing experience of the generated video are improved.
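The mapping from voice features to control parameters can be sketched as below. The feature extractors are hypothetical stubs for a real multimodal model, and the parameter shapes are invented for illustration only.

```python
# Sketch: extract content and emotion features from the target audio, then
# derive mouth-shape, facial-expression, and limb-action control parameters.

def extract_content_features(audio: str) -> dict:
    return {"phonemes": ["h", "e", "l", "o"], "rhythm": "steady"}  # stub

def extract_emotion_features(audio: str) -> dict:
    return {"intensity": 0.6, "speed": "medium", "intonation": "rising"}  # stub

def control_parameters(audio: str) -> dict:
    content = extract_content_features(audio)
    emotion = extract_emotion_features(audio)
    return {
        # mouth shape is driven by the speech content (semantic sync)
        "mouth": content["phonemes"],
        # facial expression is driven by the speech emotion
        "expression": {"smile": emotion["intensity"]},
        # limb actions draw on both content and emotion features
        "limbs": {"gesture_rate": emotion["intensity"], "beat": content["rhythm"]},
    }

print(control_parameters("target_audio.wav"))
```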
Further, in this embodiment, alignment processing is performed on the target audio and the target subtitle; for example, timestamp inference and alignment labeling are performed on the target subtitle based on the timestamps of the voice content in the target audio, ensuring that the subsequent subtitle presentation is fully synchronized with the target audio.
In this embodiment, one way to generate the background map is to call the text-generated map model to generate the background map according to the background map description information, so as to obtain the target background map. The other is to directly acquire a background image generated or photographed in advance, which is not limited.
In an alternative embodiment, a corresponding set of video material generation states is maintained for each of the N video instance identifiers; the generation state of a video material of any material type in each group of video materials is one of a ready-to-generate state, a generating state, a generation-success state, and a generation-failure state. For example, for any video instance identifier whose plurality of video materials to be generated includes a target subtitle, target audio, a green-screen video of the target digital person, and a target background image, respective generation states may be maintained for the target subtitle, the target audio, the green-screen video of the target digital person, and the target background image.
The plurality of video materials corresponding to each video instance identifier is initially marked as being in the ready-to-generate state. When the plurality of artificial-intelligence-based material generation models is invoked to generate the video materials corresponding to a video instance identifier: if any material generation model returns a generation-response message, the video material it is generating is updated to the generating state; if any material generation model returns a generation-success message, the video material it generated is updated to the generation-success state; and if any material generation model returns a generation-failure message, the video material is updated to the generation-failure state. As shown in FIG. 2, a subscription service is provided: each material generation model can issue a notification informing the subscription service of the generation state of its video material, for example the generation-success state; the subscription service then returns a generation-success message to the server to inform it that the corresponding video material has been generated successfully. FIG. 2 takes only the notification process of the target digital person as an example, but is not limited thereto.
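A sketch of this per-material state tracking follows; the four states mirror those named above, and the transition triggers mirror the model callback messages described in the text. It is an illustrative model, not the actual implementation.

```python
# Sketch: per-material generation-state tracking for one video instance.
from enum import Enum

class GenState(Enum):
    READY = "ready-to-generate"
    GENERATING = "generating"
    SUCCESS = "generation-success"
    FAILURE = "generation-failure"

class MaterialTracker:
    def __init__(self, instance_id: str, material_types: list):
        self.instance_id = instance_id
        self.states = {m: GenState.READY for m in material_types}  # initial marking

    def on_generation_response(self, material: str):
        self.states[material] = GenState.GENERATING

    def on_success(self, material: str):
        self.states[material] = GenState.SUCCESS

    def on_failure(self, material: str):
        self.states[material] = GenState.FAILURE  # would trigger a failure reminder

tracker = MaterialTracker("vid-001", ["subtitle", "audio", "green_screen", "background"])
tracker.on_generation_response("audio")
tracker.on_success("audio")
tracker.on_failure("background")
print({m: s.value for m, s in tracker.states.items()})
```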
Further optionally, if the generation state of a video material is updated to the generation failure state, a failure reminding message is output to the user who initiated the input operation, where the failure reminding message includes the material type of the video material updated to the generation failure state and the corresponding video instance identifier. If a regeneration operation triggered by the user is received, the description information of the corresponding video material is acquired according to the video instance identifier corresponding to that video material, and the corresponding artificial-intelligence-based material generation model is called to regenerate the video material.
In this embodiment, after the plurality of video materials corresponding to each video instance identifier are generated, they may be synchronously uploaded to a content distribution network, and the access links of the video materials corresponding to each video instance identifier in the content distribution network may be acquired. For any video instance identifier, in the case that the plurality of video materials corresponding to the video instance identifier include the target subtitle, the target audio, the green curtain video of the target digital person and the target background image, these materials are uploaded to the content distribution network and their respective access links are obtained.
Further, responding to a batch video generation triggering event, respectively acquiring a plurality of video materials corresponding to the N video instance identifications according to access links of the plurality of video materials corresponding to the N video instance identifications in a content distribution network, and generating videos according to the plurality of video materials corresponding to the N video instance identifications and a target video template so as to obtain N videos under video categories.
In the above example, in response to a batch video generation trigger event, according to access links of the N video instance identifications corresponding to the target subtitles, the target audio, the green curtain video of the target digital person and the target background image in the content distribution network, respectively obtaining the target subtitles, the target audio, the green curtain video of the target digital person and the target background image corresponding to the N video instance identifications; and generating videos according to the N video instance identifications respectively corresponding to the target subtitle, the target audio, the green curtain video of the target digital person, the target background image and the target video template to obtain N videos under the video category.
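For illustration, a hedged Python sketch of this batch step follows; the link layout (a mapping from material type to CDN access link) and the render_video stand-in are assumptions of the sketch rather than the actual rendering engine.

import urllib.request

def fetch(url):
    # Download one video material from its CDN access link.
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def render_video(template, materials):
    # Stand-in for the actual rendering step, which would apply the
    # target video template's rendering rules to the fetched materials.
    return {"template": template, "materials": sorted(materials)}

def generate_batch(instance_links, target_template):
    videos = {}
    for instance_id, links in instance_links.items():
        materials = {mtype: fetch(url) for mtype, url in links.items()}
        videos[instance_id] = render_video(target_template, materials)
    return videos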
Further optionally, the videos and target video templates corresponding to the N video instance identifiers are uploaded to the content distribution network, and the access links of these videos and target video templates in the content distribution network are acquired and added to a video generation result page. In response to a viewing operation on the video generation result page, the video generation result page is displayed; the page includes, for at least one of the N video instance identifiers, the access links in the content distribution network of the corresponding set of video materials, video and target video template.
In this embodiment, the video materials, videos and target video templates of each video instance identifier are stored in the content distribution network, so that the server does not need to store a large number of files, which reduces the disk I/O load and ensures service stability. Further, because only the access links of the video materials, videos and target video templates stored in the content distribution network are kept for each video instance identifier, data bloat can be avoided, query pressure is reduced, and larger-scale data management is supported.
In this alternative embodiment, in response to a triggering operation on the access link, in the content distribution network, of the set of video materials, video and/or target video template corresponding to the at least one video instance identifier, the corresponding set of video materials, video and/or target video template is accessed. By hosting the videos and video templates on the CDN, the video generation result page can load videos directly from the content distribution network; compared with pulling videos from the server, this occupies less bandwidth and loads faster, can significantly improve performance, and is suitable for large-scale video browsing scenarios.
The manner of the variable processing is described below.
In an alternative embodiment, the material type is used as a splitting variable to parse the initial video template into information fragments corresponding to multiple material types; rendering parameter information corresponding to each material type is extracted from its information fragment, and the rendering parameter information corresponding to the multiple material types is templated to obtain multiple materialization templates.
Further, after the initial video template is obtained, the material type is used as a splitting variable to parse the initial video template. Material types are the classification standard used to distinguish different kinds of video material, and splitting variables refer to the independent information fragments obtained by dividing the initial video template according to material type during parsing. For example, if the initial video template includes two types of video material, image and audio, the material type is used as the classification standard to split the initial video template into independent image and audio information fragments, which are then processed separately.
An information fragment is extracted from the initial video template and corresponds to one material type; it may contain the rendering parameter information corresponding to that material type. For example, an information fragment may contain rendering parameter information for one material type, such as path information, play time and special-effect application.
In the embodiment of the present application, the rendering parameter information is used to describe at least one attribute of a material file of a material type. Each attribute may serve as a rendering parameter, and the attribute value of each attribute in the initial video template serves as the default parameter value of the corresponding rendering parameter; the rendering parameters with default parameter values form the rendering parameter information corresponding to the material file of that material type. "Material files" refer to the various resource files used in the materialization template, mainly the various video materials such as pictures, videos and audio. The rendering parameter information may be used to control the rendering logic of the material files so that they can be rendered into the finally generated video. In other words, the rendering parameter information quantitatively describes the attributes of a material file: each attribute of the video material is expressed as a corresponding rendering parameter, a rendering parameter with a default parameter value fixes the value of that attribute, and the rendering effect of the video material in the finally generated video is controlled through this rendering parameter information.
Rendering parameter information may thus be extracted from the information fragments corresponding to the multiple material types, where the rendering parameter information corresponding to each material type includes at least one rendering parameter and its default parameter value, and each rendering parameter describes one attribute of a material file of that material type. The embodiment of the present application does not limit the specific attributes to which the rendering parameters of each material type correspond.
In the embodiment of the present application, templating converts the rendering parameter information corresponding to each material type from rendering parameters with fixed default parameter values into a set of variable parameters whose values can be replaced dynamically. By templating the rendering parameter information corresponding to each of the multiple material types, multiple materialization templates corresponding to the multiple material types are obtained; each materialization template describes the rendering rule of one material type, the rendering rule includes at least one variable parameter, and each variable parameter is associated with multiple candidate parameter values so as to control the rendering effect of the generated video.
In the embodiment of the present application, a materialization template is the result of templating the rendering parameter information corresponding to a material type. It includes a placeholder for the path information of the material file, together with the variable parameters and their associated candidate parameter values. The materialization template describes the rendering rule of material files of the corresponding material type in the generated video; the rendering rule describes the rendering logic to be followed for such video material during video rendering, so as to control the intended rendering effect of such material files in the generated video. By converting rendering parameter information from fixedly configured rendering parameters with default values into a materialization template formed by a dynamically replaceable set of variable parameters, different candidate parameter values can be given to the variable parameters to flexibly adjust the rendering effect of the material files in different video generation scenarios. Because the templated rendering parameter information forms multiple materialization templates that can be recombined modularly, different target video templates can be obtained by recombining the materialization templates of different material types, and diversified videos can be generated in batches without designing and adjusting a separate video template for each video. This saves time and human resources, improves the flexibility of video template generation and the diversity of content, and thereby enables efficient batch generation of videos with diversified styles.
For example, for the same materialization template, by giving different candidate parameter values to the variable parameters, the attribute values of the various attributes of the corresponding material type can be set flexibly, thereby achieving custom control over the rendering rule of the material file. The materialization template improves the flexibility of video generation, can significantly improve the efficiency and quality of video generation, and meets diversified creation requirements.
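To make the structure concrete, a hypothetical materialization template for an audio material type might look like the following Python sketch; the field names and candidate values are assumptions for illustration, as the embodiment does not limit the storage format.

audio_materialization_template = {
    "material_type": "audio",
    "material_file": "{{audio_path}}",  # placeholder, filled at render time
    "rendering_params": {               # rendering parameters with defaults
        "volume": "1.0",
        "fade_in_sec": "0.0",
    },
    "variable_params": {                # candidate values per variable parameter
        "volume": ["0.8", "1.0", "1.2"],
        "fade_in_sec": ["0.0", "0.5", "1.0"],
    },
}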
The embodiment of the present application does not limit the specific content of the candidate parameter values associated with the variable parameters obtained by templating the rendering parameter information corresponding to each of the multiple material types.
By associating multiple candidate parameter values with the variable parameters, the variable parameters can be flexibly adjusted as required, which improves the flexibility and diversity of video template generation and thereby enables efficient batch generation of videos with diversified styles.
In an alternative embodiment, parsing the initial video template with the material type as the splitting variable to obtain information fragments corresponding to multiple material types includes: loading the XML document corresponding to the initial video template, where the XML document includes a root element and multiple non-root elements connected to the root element, the multiple non-root elements include multiple specific elements, and each specific element describes the rendering rule of one material type; traversing the non-root elements in the XML document from the root element to identify the multiple specific elements; and extracting the information fragments where the multiple specific elements are located as the information fragments corresponding to the multiple material types. The XML document of the initial video template may be loaded into memory to form a parsable tree structure. In the embodiment of the present application, the XML document may be loaded into memory through a DOM (Document Object Model) parser to form the tree structure.
The tree structure includes a root element and non-root elements. The root element is the top node of the XML document, and non-root elements are nested directly or indirectly under the root element. An XML document has one and only one root element, which is the first element of the XML document and serves as its starting point.
The non-root elements are sub-elements directly or indirectly nested under the root element, and are used to divide the initial video template into different information fragments according to material type. The non-root elements include multiple specific elements, which can be identified by traversing the non-root elements in the XML document from the root element. Specific elements are elements that directly describe the rendering rule of a certain material type, and each specific element corresponds to one material type.
By extracting these pieces of information, the system can quickly recognize the attributes of the material files and render them according to their attribute values. The design of the extracted information fragment obviously improves the flexibility and diversity of video generation, meets the requirement of diversified creation, and simultaneously provides a technical foundation for efficient batch generation of videos.
In an alternative embodiment, traversing the non-root elements in the XML document from the root element to identify the multiple specific elements includes: S1, traversing the non-root elements in the XML document starting from the root element; S2, for the currently traversed non-root element, acquiring the element tag contained in it; S3, if the element tag is a specific tag, judging whether the currently traversed non-root element contains sub-elements; S4, if it contains sub-elements, taking each sub-element as the currently traversed non-root element and returning to step S2; S5, if the element tag is a non-specific tag, moving to the next non-root element and returning to step S2; and S6, if the currently traversed non-root element contains no sub-elements, taking it as a specific element.
Wherein, in step S1, from the root element of the XML document, non-root elements are accessed one by one for traversal. In the embodiment of the present application, the specific implementation strategy of traversal is not limited. For example, traversal may be implemented using a depth-first search algorithm, or may be implemented using a breadth-first search algorithm. In the embodiment of the application, taking a depth-first search algorithm as an example, a process of traversing non-root elements in an XML document to identify a plurality of specific elements is described in detail.
Next, step S2 is executed to obtain, for the currently traversed non-root element, the element tag contained in it. In an XML document, element tags are identifiers in angle brackets (< >) used to mark the type and semantics of an element. Different material types are distinguished by the names of the element tags, and the specific position of a material is located through the hierarchical relationship of the element tags.
After the element label contained in the currently traversed non-root element is obtained, judging whether the element label is a specific label or not. The specific tag may be a predefined set of key tags, representing element tags that need special processing in the XML document, for identifying the type of material that needs to be extracted.
Then, step S3 or S5 is performed. If step S5 is executed, that is, the element tag is a non-specific tag, traversal continues with the next non-root element and step S2 is executed again.
If step S3 is executed, that is, the element tag is a specific tag, whether the currently traversed non-root element contains sub-elements is further judged. A sub-element is another element nested inside the currently traversed non-root element.
Then, step S4 or S6 is performed. If step S4 is executed, that is, the currently traversed non-root element contains sub-elements, each sub-element is taken as the currently traversed non-root element and step S2 is executed again. If step S6 is executed, that is, the currently traversed non-root element contains no sub-elements, the currently traversed non-root element is taken as a specific element. Step S4 is the recursive case: the sub-elements of the currently traversed non-root element are set as new traversal starting points, steps S2-S6 are re-executed for each sub-element, and after the recursion ends, the remaining non-root elements continue to be traversed. Step S6 is the terminal case: an element without sub-elements is collected as a specific element.
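The following Python sketch illustrates steps S1-S6 with the standard DOM parser; the tag names in SPECIFIC_TAGS are hypothetical, since the embodiment does not fix which element tags count as specific tags.

from xml.dom import minidom

SPECIFIC_TAGS = {"video", "audio", "image", "text"}  # assumed specific tags

def find_specific_elements(xml_path):
    doc = minidom.parse(xml_path)  # load the XML document as a DOM tree
    specific = []

    def visit(element):
        # S2: inspect the element tag of the currently traversed element.
        children = [c for c in element.childNodes
                    if c.nodeType == c.ELEMENT_NODE]
        if element.tagName in SPECIFIC_TAGS:  # S3: specific tag?
            if children:                      # S4: recurse into sub-elements
                for child in children:
                    visit(child)
            else:                             # S6: collect as a specific element
                specific.append(element)
        else:                                 # S5: keep traversing
            for child in children:
                visit(child)

    # S1: traverse the non-root elements starting from the root element.
    for top in doc.documentElement.childNodes:
        if top.nodeType == top.ELEMENT_NODE:
            visit(top)
    return specific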
In an alternative embodiment, extracting the rendering parameter information corresponding to each of the multiple material types from their respective information fragments includes, for each information fragment: extracting from the information fragment the path information of a material file and at least one attribute value of the material file, the attribute values being used for rendering the material file; reading the material file according to the path information and determining the material type described by the information fragment according to the extension of the material file; and taking the at least one attribute to which the at least one attribute value belongs as at least one rendering parameter, and the at least one attribute value as the default parameter value of that rendering parameter, so as to obtain the rendering parameter information corresponding to the material type described by the information fragment.
The information fragment is the part extracted from the XML document that relates to a material type, and contains the path information of a material file and at least one attribute value of the material file. The material files are used to construct the various media that ultimately make up the generated video. In the embodiment of the present application, the material files may include, but are not limited to, video files in formats such as MP4 and AVI, which contain dynamic images and audio for displaying a series of continuous pictures, and audio files in formats such as WAV and MP3, which provide content such as background music, narration or sound effects.
The at least one attribute value of the material file extracted from the information fragment is the specific value of an attribute, describing the specific state of that attribute. An attribute may describe a rendering effect of the material file. For example, when the material file is a video file, the attribute may be video duration, video speed, filter effect and the like; if the attribute is video duration, the attribute value may be 2 minutes, 1 hour, 1 day and the like.
In an alternative embodiment, the rendering parameter information corresponding to the material type described by the information fragment includes at least one rendering parameter and the default parameter value of the at least one rendering parameter. The default parameter value is the attribute value that the corresponding attribute has before templating. The at least one attribute to which the at least one extracted attribute value belongs serves as the at least one rendering parameter, and the at least one attribute value serves as its default parameter value, which yields the rendering parameter information. That is, the rendering parameters and default parameter values are combined into a set of key-value pairs that control the rendering logic of the material files in the video so as to achieve the expected rendering effect.
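A hedged Python sketch of this extraction is given below; the src attribute name and the extension-to-type mapping are assumptions of the sketch, not fixed by the embodiment.

import os

EXT_TO_TYPE = {".mp4": "video", ".avi": "video",
               ".mp3": "audio", ".wav": "audio",
               ".png": "image", ".jpg": "image"}

def extract_rendering_params(element):
    path = element.getAttribute("src")               # path information
    ext = os.path.splitext(path)[1].lower()
    material_type = EXT_TO_TYPE.get(ext, "unknown")  # type from extension
    # Each remaining attribute becomes a rendering parameter; its value in
    # the initial template is kept as the default parameter value.
    params = {name: element.getAttribute(name)
              for name in element.attributes.keys() if name != "src"}
    return material_type, {"path": path, "rendering_params": params}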
In an alternative embodiment, templating the rendering parameter information corresponding to each of the multiple material types to obtain multiple materialization templates includes: selecting at least one variable parameter from the rendering parameter information corresponding to each material type; associating multiple candidate parameter values with the at least one variable parameter; and generating the materialization template corresponding to the material type according to the multiple candidate parameter values associated with the at least one variable parameter. The variable parameters are obtained by variable-izing rendering parameters that have default parameter values; each variable parameter is associated with multiple candidate parameter values, and different candidate parameter values correspond to different rendering logic so as to produce different rendering effects in the generated video, which enables diversified materialization templates. For each material type, at least one variable parameter is selected from its corresponding rendering parameter information and may be varied within an adjustment range to obtain different materialization templates. For example, when the material file is a video material, the play speed and the filter special effect may be selected as variable parameters. The variable parameters are associated with multiple candidate selectable values, which limit the adjustment range of the variable parameters, ensuring that the material files used to generate videos meet the design requirements and avoiding invalid settings.
In an alternative embodiment, when selecting at least one variable parameter from the rendering parameter information corresponding to a material type, two selection modes are provided: one is to take all rendering parameters as variable parameters, and the other is to select part of the rendering parameters as variable parameters according to weight values. When all rendering parameters are selected as variable parameters, no comparison of weight values is needed, and all rendering parameters can be adjusted as variable parameters.
In an alternative embodiment, selecting part of the rendering parameters as variable parameters according to weight values may be implemented as follows: weight values of the respective rendering parameters are configured in advance for the material type; the rendering parameters are parsed out of the rendering parameter information corresponding to the material type; and at least one rendering parameter whose weight value is greater than a set weight threshold is selected from them as the at least one variable parameter. A weight value can be regarded as an importance score assigned to each rendering parameter, quantifying the degree to which that rendering parameter influences the rendering effect of the material file in video generation, so that at least one rendering parameter with a larger influence on user perception or the application goal can be screened out as a variable parameter. The weight threshold is a predefined critical value: only parameters whose weight value is higher than the threshold are selected as variable parameters, which prevents too many irrelevant rendering parameters from becoming variable parameters and thereby avoids problems such as configuration conflicts or confused rendering logic.
In the embodiment of the present application, the manner of pre-configuring the weight values of the respective rendering parameters is not limited. For example, the weight values of the pre-configured rendering parameters may be annotated empirically or may be automatically generated by user behavior data analysis.
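For illustration, a minimal Python sketch of the weight-threshold selection follows; the weight numbers are made up, since the embodiment does not limit how the weight values are pre-configured.

WEIGHTS = {"play_speed": 0.9, "filter": 0.8, "duration": 0.3}  # assumed weights

def select_variable_params(rendering_params, weights, threshold=0.5):
    # Keep only the rendering parameters whose configured weight exceeds
    # the set weight threshold; these become the variable parameters.
    return [p for p in rendering_params if weights.get(p, 0.0) > threshold]

variable = select_variable_params(["play_speed", "filter", "duration"], WEIGHTS)
# variable -> ["play_speed", "filter"]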
Optionally, the at least one variable parameter may also be selected randomly from the rendering parameter information corresponding to the material type, according to a set number of variable parameters. In an alternative embodiment, generating the materialization template corresponding to the material type according to the multiple candidate parameter values associated with the at least one variable parameter includes: adding each rendering parameter corresponding to the material type and its default parameter value to a preset template file; adding the multiple candidate parameter values associated with the at least one variable parameter to the preset template file; and adding a placeholder for carrying the material file corresponding to the material type to the preset template file, so as to obtain the materialization template corresponding to the material type. The preset template file is a basic video template with basic configuration, including the basic structure of the video template and several preset rendering parameters. Adding each rendering parameter corresponding to the material type and its default parameter value to the preset template file ensures that the resulting materialization template has complete rendering parameter information.
In this embodiment, on the basis of adding each rendering parameter corresponding to the material type and a default parameter value of each rendering parameter to a preset template file, a plurality of candidate parameter values associated with at least one variable parameter are added to the preset template file. Wherein the candidate parameter values provide a plurality of choices allowing a user or system to select different candidate parameter values according to the needs.
In an optional embodiment, on the basis of adding the multiple candidate parameter values associated with the at least one variable parameter to the preset template file, a placeholder for carrying the material file corresponding to the material type may be added to the preset template file, so as to obtain the materialization template corresponding to the material type. The placeholder is a reserved position for filling in the material file and supports dynamic replacement; for example, the path information of the actual material file corresponding to the material type may be filled in. With placeholders, different material files can be substituted flexibly without modifying the template structure.
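The assembly of such a materialization template from a preset template file can be sketched in Python as follows, consistent with the illustrative layout shown earlier; the dictionary layout and the {{material_path}} placeholder syntax are assumptions for illustration.

def build_materialization_template(material_type, rendering_params,
                                   variable_params, candidate_values):
    template = {"material_type": material_type,
                "material_file": "{{material_path}}",  # placeholder, filled later
                "params": dict(rendering_params)}      # defaults preserved
    # Each variable parameter is associated with its candidate values.
    template["variables"] = {p: candidate_values[p] for p in variable_params}
    return template

tpl = build_materialization_template(
    "video",
    {"play_speed": "1.0", "filter": "none"},
    ["play_speed", "filter"],
    {"play_speed": ["0.5", "1.0", "2.0"], "filter": ["none", "sepia", "blur"]})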
In the above embodiment, the materialization template integrates the rendering parameters of a material type and their default parameter values into the preset basic template, which ensures the completeness of the rendering parameter information of the generated materialization template. Meanwhile, by associating multiple candidate parameter values with the variable parameters, the parameter values of the variable parameters can be selected dynamically to flexibly adjust the corresponding video style. The embedded placeholder further decouples the parameter configuration of the initial video template from the material files and supports dynamic replacement of material files without modifying the template structure, thereby allowing variable parameters to be given appropriate candidate parameter values according to application requirements so as to obtain freely combinable materialization templates. Ultimately, this significantly improves the flexibility and diversity of video template generation and enables efficient batch production of video content in various styles, solving problems in traditional video production such as complicated parameter configuration of video templates, complex adaptation processes and serious homogeneity of the generated video content.
Based on the multiple materialization templates, batch video generation or single video generation can be performed. For batch video generation scenarios, multiple videos with various styles and different contents can be generated using the multiple materialization templates provided by the embodiment of the present application.
The detailed implementation and the beneficial effects of each step in the method of this embodiment have been described in the foregoing embodiments, and will not be described in detail herein.
In addition, some of the flows described in the above embodiments and drawings include multiple operations appearing in a specific order, but it should be clearly understood that these operations may be performed out of the order in which they appear herein or in parallel. Sequence numbers of operations, such as 11 and 12, are merely used to distinguish different operations and do not themselves represent any order of execution. In addition, these flows may include more or fewer operations, and these operations may be performed sequentially or in parallel. It should be noted that the descriptions of "first", "second" and the like herein are used to distinguish different messages, devices, modules and so on; they do not represent a sequence, nor do they require that the "first" and the "second" be of different types.
Fig. 3 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application. As shown in fig. 3, the electronic device includes a memory 34 and a processor 35.
Memory 34 is used to store computer programs and may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, a first video template, a first video, a target materialization template, and so forth.
The memory 34 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The processor 35 is coupled to the memory 34 and is configured to execute the computer program in the memory 34, so as to: download a first video template to the local terminal device, where the first video template is the video template used to render a set of video materials to generate a first video, and includes a target materialization template, from among multiple materialization templates, adapted to the material types in the set of video materials, the multiple materialization templates being obtained by performing variable processing on an initial video template; each materialization template describes the rendering rule of one type of video material, the first video template describes the rendering rules and hierarchical relationship of the set of video materials, and the hierarchical relationship is embodied as the positional relationships of the target materialization templates within the first video template; run video editing software on the terminal device, import the first video template into the video editing software, open the first video template in an editing area of the video editing software, and display, in a preview area of the video editing software, the first video generated based on the first video template; and, in response to a modification operation on the first video template, determine a modified rendering rule in any target materialization template and/or a modified positional relationship between any two target materialization templates, and edit the first video according to the modified rendering rule and/or the modified positional relationship to obtain an edited second video.
In an alternative embodiment, the processor 35 determines, in response to a modification operation on the first video template, the modified rendering rule in any target materialization template by: in response to a positioning operation on the first video template, displaying the target materialization template to be modified in the editing area, and, in response to a code editing operation in that target materialization template, obtaining the modified rendering rule; or, in response to a trigger operation for modifying any target materialization template in the first video template, displaying a first editing window, determining, in response to a first editing operation in the first editing window, the name of the target materialization template and the modified rendering rule, and, in response to a first editing submission operation, refreshing the first video template according to the name of the target materialization template and the modified rendering rule.
In an alternative embodiment, the processor 35 determines, in response to a modification operation on the first video template, the modified positional relationship between any two target materialization templates by: in response to a drag operation on any target materialization template in the first video template, moving that target materialization template along the drag track of the drag operation, determining another target materialization template according to the position at which the drag operation terminates, filling the dragged target materialization template into the structural position of the other target materialization template, and filling the other target materialization template into the structural position of the dragged target materialization template, so as to obtain the modified positional relationship between the two target materialization templates; or, in response to a trigger operation for adjusting the positional relationship between any two target materialization templates in the first video template, displaying a second editing window, determining, in response to a modification operation in the second editing window, the names of the two target materialization templates and the information of their swapped structural positions, and, in response to a modification submission operation, refreshing the first video template according to the names of the two target materialization templates and the information of their swapped structural positions, so as to obtain the modified positional relationship between the two target materialization templates; where a structural position describes the position of the corresponding target materialization template in the first video template.
In an alternative embodiment, the processor 35 downloads the first video template to the local terminal device by: in response to an access operation on a batch video generation result page, displaying the batch video generation result page, where the page displays the access links, on a CDN network or server, of the target video templates corresponding to the N video instance identifiers, the access links of the sets of video materials corresponding to the N video instance identifiers, and the access links of the videos corresponding to the N video instance identifiers, N being an integer greater than or equal to 2; and, in response to a trigger operation on the access link of any target video template on the CDN network or server, downloading that target video template from the CDN network or server to the local terminal device as the first video template.
In an alternative embodiment, the processor 35, in response to a trigger operation on the access links of multiple video materials on the CDN network or server, downloads the multiple video materials from the CDN network or server to the local terminal device, where the multiple video materials are distributed among one or more sets of video materials; runs the video editing software on the terminal device, imports the multiple video materials into the video editing software, and displays the multiple video materials in the editing area of the video editing software; and, in response to an editing operation on the multiple video materials, generates a third video and generates, according to the edited attribute information and hierarchical relationship of the multiple video materials, a second video template corresponding to the third video, where the second video template includes the rendering rules of the multiple video materials and the positional relationships of the multiple video materials in the second video template.
In an alternative embodiment, the processor 35, in response to input operations for the number of videos and the video category on a video generation page, generates a batch video generation task, where the batch video generation task includes the video number N and the video category; generates N video instance identifiers according to the batch video generation task, and generates, for each video instance identifier, a set of video materials related to the video category; acquires multiple materialization templates obtained by performing variable processing on the initial video template, where the initial video template includes the rendering rules of the multiple video materials required for generating videos and the hierarchical relationships among the multiple video materials; for each video instance identifier, determines at least one target materialization template from the multiple materialization templates according to the material types in the set of video materials corresponding to that video instance identifier, and combines the at least one target materialization template based on the hierarchical relationships among the multiple video materials included in the initial video template, so as to obtain the target video template corresponding to that video instance identifier, the target video template describing the rendering rules and hierarchical relationship of the set of video materials; performs video generation processing according to the target video templates and sets of video materials corresponding to the N video instance identifiers, so as to obtain the N videos under the video category; and uploads the target video templates, sets of video materials and videos corresponding to the N video instance identifiers to the CDN network or server for storage.
In an alternative embodiment, the processor 35 uses the material type as the splitting variable to parse the initial video template into information fragments corresponding to the multiple material types, extracts the rendering parameter information corresponding to each material type from its information fragment, and templates the rendering parameter information corresponding to the multiple material types to obtain the multiple materialization templates.
Further, as shown in fig. 3, the electronic device also includes a communication component 36, a display 37, a power supply component 38, an audio component 39 and other components. Only some components are schematically shown in fig. 3, which does not mean that the electronic device includes only the components shown. In addition, the components within the dashed box in fig. 3 are optional rather than mandatory, depending on the product form of the working node. The working node of this embodiment may be implemented as a terminal device such as a desktop computer, a notebook computer, a smartphone or an IoT device, or as a server-side device such as a conventional server, a cloud server or a server array. If the working node of this embodiment is implemented as a terminal device such as a desktop computer, a notebook computer or a smartphone, it may include the components in the dashed box in fig. 3; if it is implemented as a server-side device such as a conventional server, a cloud server or a server array, it may omit the components in the dashed box in fig. 3.
Accordingly, the present application also provides a computer readable storage medium storing a computer program, where the computer program is executed to implement the steps executable by the electronic device in the above method embodiments.
The memory may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random-Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The communication component is configured to facilitate wired or wireless communication between the device in which it is located and other devices. That device can access a wireless network based on a communication standard, such as WiFi, a 2G, 3G, 4G/LTE or 5G mobile communication network, or a combination thereof. In one exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology and other technologies.
The display includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation.
The power supply component provides power for various components of equipment where the power supply component is located. The power components may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the devices in which the power components are located.
The audio component described above may be configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive external audio signals when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, magnetic disk storage, CD-ROM (Compact Disc Read-Only Memory), optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (Central Processing Unit, CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (Random Access Memory, RAM) and/or non-volatile memory, etc., such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, Phase-change Random Access Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article or apparatus. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises that element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1.一种视频编辑方法,其特征在于,包括:1. A video editing method, comprising: 下载第一视频模板至本地终端设备,所述第一视频模板是对一组视频素材进行渲染生成第一视频所使用的视频模板,所述第一视频模板包括:多种素材化模板中与所述一组视频素材中的素材类型适配的目标素材化模板,所述多种素材化模板是对初始视频模板进行变化量得到的;每种素材化模板用于描述一种视频素材的渲染规则,所述第一视频模板用于描述所述一组视频素材的渲染规则和层级关系,所述层级关系体现为所述目标素材化模板在所述第一视频模板之间的位置关系;Downloading a first video template to a local terminal device, the first video template is a video template used to render a group of video materials to generate a first video, the first video template comprising: a target material template adapted to a material type in the group of video materials among multiple material templates, the multiple material templates being obtained by varying an initial video template; each material template being used to describe a rendering rule for a type of video material, the first video template being used to describe a rendering rule and a hierarchical relationship for the group of video materials, the hierarchical relationship being embodied as a positional relationship between the target material template and the first video template; 运行所述终端设备上的视频编辑软件,将所述第一视频模板导入所述视频编辑软件,在所述视频编辑软件的编辑区域打开所述第一视频模板,并在所述视频编辑软件的预览区域显示基于所述第一视频模板生成的所述第一视频;Running the video editing software on the terminal device, importing the first video template into the video editing software, opening the first video template in the editing area of the video editing software, and displaying the first video generated based on the first video template in the preview area of the video editing software; 响应于对所述第一视频模板的修改操作,确定任一目标素材化模板中被修改的渲染规则和/或任意两个目标素材化模板之间被修改的位置关系,并根据所述被修改的渲染规则和/或被修改的位置关系,对所述第一视频进行编辑,以得到编辑后的第二视频。In response to a modification operation on the first video template, the modified rendering rules in any target materialization template and/or the modified positional relationship between any two target materialization templates are determined, and the first video is edited according to the modified rendering rules and/or the modified positional relationship to obtain an edited second video. 2.根据权利要求1所述的方法,其特征在于,响应于对所述第一视频模板的修改操作,确定任一目标素材化模板中被修改的渲染规则,包括:2. The method according to claim 1, characterized in that, in response to the modification operation on the first video template, determining the modified rendering rule in any target materialization template comprises: 响应于对所述第一视频模板的定位操作,将需要被修改的任一目标素材化模板展示在所述编辑区域;响应于在所述任一目标素材化模板中的代码编辑操作,获取被修改的渲染规则;In response to the positioning operation on the first video template, any target materialization template that needs to be modified is displayed in the editing area; in response to the code editing operation in any target materialization template, a modified rendering rule is obtained; 或者or 响应于对所述第一视频模板中任一目标素材化模板进行修改的触发操作,展示在第一编辑窗口;响应于在所述第一编辑窗口中的第一编辑操作,确定所述任一目标素材化模板的名称以及被修改的渲染规则;响应于第一编辑提交操作,根据所述任一目标素材化模板的名称以及被修改的渲染规则,对所述第一视频模板进行刷新。In response to a trigger operation to modify any target materialization template in the first video template, it is displayed in a first editing window; in response to a first editing operation in the first editing window, the name of any target materialization template and the modified rendering rule are determined; in response to the first editing submission operation, the first video template is refreshed according to the name of any target materialization template and the modified rendering rule. 3.根据权利要求1所述的方法,其特征在于,响应于对所述第一视频模板的修改操作,确定任意两个目标素材化模板之间被修改的位置关系,包括:3. 
The method according to claim 1, characterized in that, in response to the modification operation on the first video template, determining the modified positional relationship between any two target material templates comprises: 响应于对所述第一视频模板中任一目标素材化模板的拖动操作,根据所述拖动操作的拖动轨迹移动所述任一目标素材化模板,并根据所述拖动操作终止时的位置,确定另一目标素材化模板;将所述任一素材化模板填充至所述另一目标素材化模板所在的结构位中,并将所述另一目标素材化模板填充至所述任一目标素材化模板的结构位中,以得到任意两个目标素材化模板之间被修改的位置关系;In response to a drag operation on any target materialization template in the first video template, the target materialization template is moved according to a dragging track of the drag operation, and another target materialization template is determined according to a position when the drag operation is terminated; the target materialization template is filled into a structure position where the other target materialization template is located, and the other target materialization template is filled into a structure position of the target materialization template, so as to obtain a modified positional relationship between any two target materialization templates; 或者or 响应于对所述第一视频模板中任意两个目标素材化模板进行位置关系调整的触发操作,展示第二编辑窗口;响应于所述第二编辑窗口中的修改操作,确定所述任意两个目标素材化模板的名称以及相互调换后的结构位的信息;响应于修改提交操作,根据所述任意两个目标素材化模板的名称以及相互调换后的结构位的信息,对所述第一视频模板进行刷新,以得到任意两个目标素材化模板之间被修改的位置关系;In response to a trigger operation of adjusting the positional relationship between any two target materialization templates in the first video template, a second editing window is displayed; in response to a modification operation in the second editing window, the names of the any two target materialization templates and information of the swapped structural positions are determined; in response to a modification submission operation, the first video template is refreshed according to the names of the any two target materialization templates and information of the swapped structural positions, so as to obtain a modified positional relationship between the any two target materialization templates; 其中,所述结构位是指描述对应目标素材化模板在所述第一视频模板中的位置。The structure position refers to a position describing the corresponding target material template in the first video template. 4.根据权利要求1-3任一项所述的方法,其特征在于,下载第一视频模板至本地终端设备,包括:4. The method according to any one of claims 1 to 3, characterized in that downloading the first video template to the local terminal device comprises: 响应于对批量视频生成结果页面的访问操作,展示批量视频生成结果页面,所述批量视频生成结果页面上展示有N个视频实例标识各自对应的目标视频模板在CDN网络或服务端上的访问链接、N个视频实例标识各自对应的一组视频素材在CDN网络或服务端上的访问链接以及N个视频实例标识各自对应的视频在CDN网络或服务端上的访问链接;N是≥2的整数;In response to an access operation to a batch video generation result page, a batch video generation result page is displayed, wherein the batch video generation result page displays access links of target video templates corresponding to each of the N video instance identifiers on the CDN network or the server, access links of a group of video materials corresponding to each of the N video instance identifiers on the CDN network or the server, and access links of videos corresponding to each of the N video instance identifiers on the CDN network or the server; N is an integer ≥ 2; 响应于对任一目标视频模板在CDN网络或服务端上的访问链接的触发操作,将所述任一目标视频模板作为所述第一视频模板,从所述CDN网络或服务端下载至本地终端设备。In response to a triggering operation of an access link of any target video template on a CDN network or a server, the any target video template is used as the first video template and downloaded from the CDN network or the server to a local terminal device. 5.根据权利要求4所述的方法,其特征在于,还包括:5. 
5. The method according to claim 4, further comprising:

in response to a trigger operation on the access links of multiple video materials on the CDN network or the server, downloading the multiple video materials from the CDN network or the server to the local terminal device, wherein the multiple video materials are distributed among one or more groups of video materials;

running the video editing software on the terminal device, importing the multiple video materials into the video editing software, and displaying the multiple video materials in the editing area of the video editing software;

in response to an editing operation on the multiple video materials, generating a third video, and generating, according to the edited attribute information and hierarchical relationships of the multiple video materials, a second video template corresponding to the third video, the second video template comprising the rendering rules of the multiple video materials and the positional relationships of the multiple video materials in the second video template.

6. The method according to claim 4, further comprising:

in response to an input operation on a video generation page specifying a video quantity and a video category, generating a batch video generation task, the batch video generation task comprising the video quantity N and the video category;

generating N video instance identifiers according to the batch video generation task, and generating, for each video instance identifier, a group of video materials related to the video category;

obtaining multiple materialization templates obtained by performing variable processing on an initial video template, the initial video template comprising the rendering rules of the multiple video materials required to generate a video and the hierarchical relationships among the multiple video materials;

for each video instance identifier, determining at least one target materialization template from the multiple materialization templates according to the material types in the group of video materials corresponding to the video instance identifier, and combining the at least one target materialization template based on the hierarchical relationships among the multiple video materials included in the initial video template, to obtain the target video template corresponding to the video instance identifier, the target video template describing the rendering rules and hierarchical relationships of the group of video materials;

performing video generation processing according to the target video template and the group of video materials corresponding to each of the N video instance identifiers, to obtain N videos under the video category;

uploading the target video templates, groups of video materials, and videos corresponding to the N video instance identifiers to the CDN network or the server for storage.
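A minimal sketch of the claim-6 batch pipeline follows. All function and field names are hypothetical, and generate_materials/render_video are stubs standing in for the material-generation and rendering steps the claim names without specifying:

```python
def generate_materials(category: str, instance_id: str) -> list:
    """Stub: stands in for the claim-6 step that generates a group of
    video materials related to the video category."""
    return [{"type": "video", "src": f"{category}/{instance_id}/bg.mp4"},
            {"type": "text", "content": f"{category} #{instance_id}"}]

def render_video(template: list, materials: list) -> str:
    """Stub: stands in for the actual rendering engine."""
    return "rendered.mp4"

def run_batch_task(n: int, category: str, materialization_templates: dict,
                   layer_order: list) -> list:
    """Claim-6 sketch: one video per instance identifier. For each instance,
    pick the materialization templates adapted to the material types in its
    material group, combine them in the initial template's layer order to
    form the target video template, then render."""
    results = []
    for i in range(n):
        instance_id = f"vid-{i:04d}"
        materials = generate_materials(category, instance_id)
        present_types = {m["type"] for m in materials}
        # Select target materialization templates by material type, keeping
        # the hierarchy defined by the initial video template's layer order.
        target_template = [materialization_templates[t]
                           for t in layer_order if t in present_types]
        results.append({"instance_id": instance_id,
                        "template": target_template,
                        "video": render_video(target_template, materials)})
    return results  # in the claim, templates/materials/videos go to a CDN

batch = run_batch_task(3, "real_estate",
                       {"video": {"type": "video"}, "text": {"type": "text"}},
                       layer_order=["video", "text"])
```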
7. The method according to any one of claims 1-3 or 6, further comprising:

parsing the initial video template with material type as the splitting variable, to obtain information fragments corresponding to multiple material types;

extracting, from the information fragments corresponding to the multiple material types, the rendering parameter information corresponding to each material type;

templating the rendering parameter information corresponding to each material type, to obtain the multiple materialization templates.

8. An electronic device, comprising a processor and a memory, wherein the memory is configured to store a computer program which, when executed by the processor, causes the processor to implement the steps of the method according to any one of claims 1-7.

9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to implement the steps of the method according to any one of claims 1-7.

10. A computer program product, comprising a computer program/instructions which, when executed by a processor, cause the processor to implement the steps of the method according to any one of claims 1-7.
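The three steps of claim 7 (split by material type, extract rendering parameters, templatize) can be sketched as below. A JSON layer list is assumed purely for illustration; the patent does not fix a storage format, and all field names are hypothetical:

```python
import json

def materialize(initial_template_json: str) -> dict:
    """Claim-7 sketch: split the initial video template by material type
    (the splitting variable), extract each type's rendering parameters,
    and template them into one materialization template per type."""
    fragments = {}
    for layer in json.loads(initial_template_json)["layers"]:
        fragments.setdefault(layer["type"], []).append(layer)  # split step
    templates = {}
    for material_type, frags in fragments.items():
        render_params = {k: v for frag in frags
                         for k, v in frag.items() if k != "type"}  # extract step
        templates[material_type] = {"material_type": material_type,
                                    "render_params": render_params}  # templating
    return templates

initial = json.dumps({"layers": [
    {"type": "video", "x": 0, "y": 0, "scale": 1.0},
    {"type": "text", "x": 100, "y": 40, "font_size": 32},
]})
materialization_templates = materialize(initial)
```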

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510443784.3A CN120186431A (en) 2025-04-09 2025-04-09 Video editing method, device, storage medium and program product

Publications (1)

Publication Number Publication Date
CN120186431A 2025-06-20

Family

ID=96032473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510443784.3A Pending CN120186431A (en) 2025-04-09 2025-04-09 Video editing method, device, storage medium and program product

Country Status (1)

Country Link
CN (1) CN120186431A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112040271A (en) * 2020-09-04 2020-12-04 杭州七依久科技有限公司 Cloud intelligent editing system and method for visual programming
CN113315883A (en) * 2021-05-27 2021-08-27 北京达佳互联信息技术有限公司 Method and device for adjusting video combined material
CN117793271A (en) * 2023-11-14 2024-03-29 阿里巴巴(中国)网络技术有限公司 Video synthesis method and electronic equipment
CN117939255A (en) * 2023-12-27 2024-04-26 北京三快在线科技有限公司 Video generation method, device, equipment and computer readable storage medium
WO2025020416A1 (en) * 2023-07-26 2025-01-30 北京字跳网络技术有限公司 Video editing method and apparatus, and device and medium

Similar Documents

Publication Publication Date Title
CN102256049B Automated story generation
US11978148B2 (en) Three-dimensional image player capable of real-time interaction
US6791581B2 (en) Methods and systems for synchronizing skin properties
CN111930994A (en) Video editing processing method and device, electronic equipment and storage medium
CN111372109B Smart TV and information interaction method
CN101639943B (en) Method and apparatus for producing animation
US20180143741A1 (en) Intelligent graphical feature generation for user content
US20240177739A1 (en) Video editing method and apparatus, computer device, storage medium, and product
KR20170078651A (en) Authoring tools for synthesizing hybrid slide-canvas presentations
CN106294612A Information processing method and device
JP7177175B2 (en) Creating rich content from text content
CN118012309A (en) Interactive information processing method, device and storage medium
CN118803173B (en) Video generation method, device, electronic equipment, storage medium and product
CN116170626A (en) Video editing method, device, electronic equipment and storage medium
CN119893165A (en) Method and device for generating video based on text
CN119299800A (en) Video generation method, device, computing device, storage medium and program product
CN118695044A (en) Method, device, computer equipment, readable storage medium and program product for generating promotional video
WO2025001722A1 (en) Server, display device and digital human processing method
KR20240162355A (en) Content providing method and electronic device
CN120186431A (en) Video editing method, device, storage medium and program product
CN115770387B (en) Text data processing method, device, equipment and medium
CN116366762A (en) Method, device, equipment and storage medium for setting beautifying materials
CN120186432A (en) Video template variable method, device, storage medium and program product
CN117008794A (en) Video generation method, device, equipment and medium
CN120186430A (en) Video batch generation method, device, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination