
CN109391849B - Processing method and system, multimedia output device and memory - Google Patents


Info

Publication number
CN109391849B
CN109391849B (application CN201811165319.4A)
Authority
CN
China
Prior art keywords
data
multimedia
target data
processing
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811165319.4A
Other languages
Chinese (zh)
Other versions
CN109391849A (en)
Inventor
杨茂 (Yang Mao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority claimed from application CN201811165319.4A
Publication of CN109391849A
Application granted
Publication of CN109391849B
Legal status: Active (anticipated expiration date not listed)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • AHUMAN NECESSITIES
    • A63SPORTS; GAMES; AMUSEMENTS
    • A63FCARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/80Special adaptations for executing a specific game genre or game mode
    • A63F13/812Ball games, e.g. soccer or baseball

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure provides a processing method comprising: obtaining multimedia data, the multimedia data being data transmitted to a multimedia output device through a multimedia interface; processing the multimedia data to determine target data, the target data being data that can be perceived when output by the multimedia output device; obtaining identification data corresponding to the target data; and processing the identification data so that the identification data can be perceived in association with the target data when the target data is perceived. The disclosure also provides a processing system, a multimedia output device and a memory.

Description

Processing method and system, multimedia output device and memory
Technical Field
The disclosure relates to a processing method and system, a multimedia output device and a memory.
Background
With the rapid development of electronic technology, a wide variety of output devices have appeared. In some application scenarios, a user may obtain various types of multimedia data through an output device. In the course of implementing the embodiments of the present disclosure, the inventor found that output devices in the related art generally can only passively receive multimedia data and directly output it; they cannot edit, analyze, or otherwise process the received multimedia data, resulting in a poor user experience.
Disclosure of Invention
One aspect of the present disclosure provides a processing method, including: obtaining multimedia data, wherein the multimedia data is transmitted to a multimedia output device through a multimedia interface; processing the multimedia data to determine target data, wherein the target data is data that can be perceived when output by the multimedia output device; obtaining identification data corresponding to the target data; and processing the identification data so that the identification data can be perceived in association with the target data when the target data is perceived.
Optionally, processing the multimedia data, and determining the target data includes: acquiring first input information corresponding to the operation; and matching the first input information with the multimedia data to determine the target data.
Optionally, processing the multimedia data, and determining the target data includes: acquiring second input information corresponding to the operation; acquiring extended information having an association relation with the second input information; and matching the second input information and the extension information with the multimedia data to determine the target data.
Optionally, processing the multimedia data to determine the target data, obtaining the identification data corresponding to the target data, and processing the identification data include performing the following for each frame of image in the multimedia data: processing the frame of image; and, if the frame of image includes target data, obtaining identification data corresponding to the target data and processing the identification data so that the identification data can be perceived in association with the target data when the target data is perceived.
Optionally, processing the multimedia data to determine the target data, obtaining the identification data corresponding to the target data, and processing the identification data include: performing the following for one frame of image in the multimedia data: processing the frame of image; and, if the frame of image includes target data, obtaining identification data corresponding to the target data and processing the identification data so that the target data and the identification data are displayed in association; and performing the following for the next frame of image: processing the next frame of image; and, if the degree of matching between the next frame of image and the preceding frame of image satisfies a condition, determining a movement parameter of the target object between the two frames and determining, according to the movement parameter, how the identification data is displayed in the next frame of image, so that the target data and the identification data remain displayed in association.
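As a rough illustration of the frame-to-frame behaviour described above, the following Python sketch applies the target's movement parameter to the label position when two consecutive frames match closely enough. It is not the patent's implementation: the pixel-wise match degree, the 0.8 threshold, and all function names are illustrative assumptions.

```python
def frame_match_degree(frame_a, frame_b):
    """Toy match degree: fraction of identical samples in two equal-length frames."""
    same = sum(1 for a, b in zip(frame_a, frame_b) if a == b)
    return same / len(frame_a)


def next_label_position(label_pos, target_prev, target_next,
                        frame_prev, frame_next, threshold=0.8):
    """Decide where the identification label goes in the next frame.

    If the two frames match well enough, the target's movement parameter
    (dx, dy) is applied to the label so the identification data stays
    attached to the target; otherwise None signals that the scene changed
    too much and the target must be re-detected from scratch."""
    if frame_match_degree(frame_prev, frame_next) < threshold:
        return None
    dx = target_next[0] - target_prev[0]
    dy = target_next[1] - target_prev[1]
    return (label_pos[0] + dx, label_pos[1] + dy)
```

In practice the match degree and movement parameter would come from a real tracker rather than pixel equality, but the control flow is the same.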
Optionally, the manner in which the target data is perceived is the same as or different from the manner in which the identification data corresponding to the target data is perceived.
Another aspect of the present disclosure provides a processing system comprising: the first acquisition module is used for acquiring multimedia data, wherein the multimedia data is transmitted to the multimedia output device through a multimedia interface; the first processing module is used for processing the multimedia data and determining target data; wherein the target data is data which can be sensed when being output by the multimedia output device; the second acquisition module is used for acquiring identification data corresponding to the target data; and the second processing module is used for processing the identification data so that the identification data can be associated when the target data is sensed.
Optionally, the first processing module includes a first obtaining unit, configured to obtain first input information corresponding to the operation; and the first matching unit is used for matching the first input information with the multimedia data so as to determine the target data.
Optionally, the first processing module includes: the second acquisition unit is used for acquiring second input information corresponding to the operation; a third acquiring unit configured to acquire extended information having an association relationship with the second input information; and a second matching unit, configured to match the second input information and the extension information with the multimedia data to determine the target data.
Optionally, wherein: the first processing module is further configured to perform processing on one frame of image for each frame of image in the multimedia data; the second obtaining module is further configured to obtain identification data corresponding to the target data if the frame of image includes the target data; the second processing module is further configured to process the identification data so that the identification data can be perceived in association with the target data when perceived.
Optionally, wherein: the first processing module is further configured to execute a frame of image processing on the multimedia data; the second obtaining module is further configured to obtain identification data corresponding to the target data if the frame of image includes the target data; the second processing module is further configured to process the identification data so that the target data and the identification data are displayed in a correlated manner; the first processing module is further configured to perform processing on a next frame image of the one frame image, and determine a movement parameter of a target object in the next frame image and the one frame image if a matching degree between the next frame image and the one frame image satisfies a condition; the second processing module is further configured to determine display of the identification data in a next frame of image according to the movement parameter, so that the target data and the identification data are still displayed in a correlated manner.
Yet another aspect of the present disclosure provides a multimedia output apparatus including: a multimedia interface; a processor; a memory for storing one or more programs, wherein the one or more programs, when executed by the processor, cause the processor to implement the processing method as described above.
Yet another aspect of the present disclosure provides a memory having one or more programs stored thereon, which when executed by a processor, cause the processor to implement the processing method as described above.
A further aspect of the disclosure provides a computer program comprising computer executable instructions which when executed are for implementing a method as described above.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario of a processing method and system according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow chart of a processing method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart for processing multimedia data to determine target data according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart for processing multimedia data to determine target data according to another embodiment of the present disclosure;
FIG. 5 schematically shows a block diagram of a processing system according to an embodiment of the disclosure;
FIG. 6 schematically shows a block diagram of a first processing module according to an embodiment of the disclosure;
FIG. 7 schematically shows a block diagram of a first processing module according to another embodiment of the present disclosure; and
fig. 8 schematically shows a block diagram of a multimedia output device adapted to implement the above described method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together, etc.).
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a memory having instructions stored thereon for use by or in connection with an instruction execution system.
The embodiment of the disclosure provides a processing method and a system, the method comprises the steps of obtaining multimedia data, wherein the multimedia data is data transmitted to a multimedia output device through a multimedia interface; processing multimedia data and determining target data; wherein the target data is data that can be perceived when output by the multimedia output device; acquiring identification data corresponding to target data; the identification data is processed such that the identification data can be associated with a perception when the target data is perceived.
Fig. 1 schematically illustrates an application scenario of a processing method and system according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a scenario in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, in this application scenario, taking a television as an example of a multimedia output device, a user can watch various programs through the television, for example a sports event such as a football match. For an ordinary football fan, however, or when a large number of players crowd the picture, it is difficult to clearly pick out from the video image the object the user wants to follow, such as a favorite sports star. According to the embodiment of the disclosure, after the multimedia data is received through the multimedia interface, the multimedia data can be analyzed to obtain the target data, the identification data corresponding to the target data can then be obtained, and the identification data can be processed.
For example, after a live video stream of a football match featuring the player Messi is received through the multimedia interface of the television, the video stream may be analyzed to determine that Messi is the player the user wants to focus on. To make Messi easy to distinguish in the video picture, the jersey number on Messi's shirt may be processed, for example displayed on the screen at an enlarged scale, so that the user can follow Messi's movement by association with the enlarged number. Alternatively, a text label reading "Messi" may be marked above Messi's image, so that the user can follow Messi's movement through the text label.
It should be noted that, in the related art, before multimedia data is transmitted, the original data is generally processed in the background or by the television station, typically through manual post-production, and the pre-processed data is then output directly by the terminal device; the terminal device itself does not analyze or process the multimedia data while or after receiving it. According to the embodiment of the disclosure, the multimedia output device analyzes and processes the multimedia data to determine the target data, obtains the identification data corresponding to the target data, and processes the identification data, so that the target data the user wants to follow can be clearly distinguished within the multimedia data (such as perceivable pictures or audio). The multimedia output device is thus more intelligent, the production cost is low, the user can follow the object of interest in time, and the user experience is improved.
It is to be understood that the types of the multimedia output device, the multimedia data, the target data and the identification data in the above examples are only exemplary, and the present disclosure does not limit the types of the multimedia output device, the multimedia data, the target data and the identification data, wherein the target data may be extracted from the multimedia data, and the type of the target data may be at least partially identical to the multimedia data.
According to embodiments of the present disclosure, the target data and the identification data types may be the same or different. In the case of the same type, for example, the target data may be image content, and the image content may be superimposed with identification data such as image markers, thereby making it possible to view the image content more specifically. In the case where the types are not the same, for example, the target data may be image content, and the user may be reminded of the image content with identification data such as a sound mark or a light mark. Or the target data may be audio content that may be tagged with an image such that the image tags the audio without interfering with the playing of the audio content.
FIG. 2 schematically shows a flow chart of a processing method according to an embodiment of the disclosure.
As shown in fig. 2, the method includes operations S210 to S240.
In operation S210, multimedia data is obtained, wherein the multimedia data is data transmitted to a multimedia output device through a multimedia interface.
According to an embodiment of the present disclosure, the multimedia data may include one or more categories, for example video data, audio data, and the like. The multimedia output device may likewise be one or more of, for example, a set-top box, a desktop computer, an all-in-one computer, a notebook computer, a television, and the like. The arrangement between the multimedia interface and the multimedia output device is not limited: the multimedia interface may be disposed inside or outside the output device. For example, when the output device is a set-top box or desktop computer, the multimedia interface may be an external interface of the apparatus; when the output device is an all-in-one computer, notebook computer, or television, the multimedia interface may be an internal interface of the apparatus.
In operation S220, multimedia data is processed to determine target data, wherein the target data is data that can be perceived when output by a multimedia output device.
According to the embodiment of the disclosure, the multimedia output device can process the multimedia data according to a user's input operation and determine the target data. For example, while the multimedia output device is outputting data, if the user is interested in data that has just been output, the user may input a selection operation for that data; after acquiring the selection operation, the multimedia output device may determine keyword data in response to it and identify the matching target data by recognizing the keyword data. For another example, if the user is interested in strongly directional content in the multimedia data, the user can select that content locally, so that keyword data is determined and the matching target data is identified from it. Specifically, for example, while the television is broadcasting a World Cup football match, at an exciting moment a player may pass the ball to a teammate, who carries it forward to attack; if the user does not recognize that player, the user can, at that moment, lock onto the player through a selection operation on the television, thereby determining the player as the target data.
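The keyword-matching step described above can be sketched as follows. This is a hypothetical illustration, not the patented method itself: frames are assumed to have been pre-analyzed into objects carrying recognized labels, and `determine_target` is an invented name.

```python
def determine_target(keyword, frame_objects):
    """Return the objects in one frame whose recognized labels contain the
    keyword data derived from the user's selection operation."""
    return [obj for obj in frame_objects if keyword in obj["labels"]]


# Hypothetical per-frame analysis result: two detected players with labels.
frame_objects = [
    {"name": "player7", "labels": {"number 7", "red shirt"}},
    {"name": "player10", "labels": {"number 10", "red shirt"}},
]
```

A selection operation that resolves to the keyword `"number 10"` would then pick out only the second object as the target data.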
According to the embodiment of the disclosure, the user can also directly input the keyword data by uploading an image or a sound, so that the multimedia output device determines the target data according to the keyword data. Specifically, for example, if the uploaded image is a picture of Messi, the uploaded picture may be matched with the content in the multimedia data so as to determine the target data about Messi in the multimedia data.
According to the embodiment of the present disclosure, the manner in which the target data can be perceived when output by the multimedia output device includes one or more of, for example, a visual manner, an auditory manner, an olfactory manner, and the like. Taking visual perception as an example, the target data may be displayed on the display unit, displayed on other media, or sent to a content terminal for display. Taking auditory perception as an example, the target data may be played through the playing unit, played on another medium, or sent to a playing terminal for playing. The specific manner of perception is not limited to these.
In operation S230, identification data corresponding to the target data is obtained.
According to an embodiment of the present disclosure, the identification data corresponding to the target data may be obtained in one or more ways, for example from a resource package stored in the multimedia output device, or from a network/server. The specific acquisition manner is not limited here. The identification data can be of various kinds: taking the basketball star Jeremy Lin (Lin Shuhao) as the target data, the identification data may be one or more of Lin's name, photo, and jersey number; taking the voice of the crosstalk star Yue Yunpeng as the target data, the identification data may be a photo of Yue Yunpeng himself. The present disclosure does not limit the kind of identification data.
In operation S240, the identification data is processed such that the identification data can be associated with perception when the target data is perceived.
According to the embodiment of the present disclosure, the manner in which the target data is perceived and the manner in which the corresponding identification data is perceived may be the same or different. For example, where the two manners are the same, both may be perceived visually. Specifically, the multimedia data may be a video stream of a live show, the target data may be a certain target star, and the identification data may be the name of the target star; when the target data is output through the multimedia output device, the image of the target star is presented and the name of the target star is presented in association with it. Alternatively, where the two manners are the same, both may be perceived aurally. For example, the multimedia data may be audio in which several voice actors speak, the target data may be the voice of a certain target voice actor, and the identification data may be a piece of background music; when the target data is output through the multimedia output device, the target voice actor's voice is played and the background music is played at the same time, so that the identification data is perceived in association with the target data.
According to an embodiment of the present disclosure, in a case where a perceived manner of target data and a perceived manner of corresponding identification data are different, it may be perceived in a mixed manner of visual and auditory. Specifically, taking the video stream of the live show as an example, when the target data is output through the multimedia output device, the image of the target star can be displayed, and the name of the target star can be broadcasted through voice broadcasting.
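The association step in operation S240 might look like the following sketch, which supports a visual and an auditory modality. The 20-pixel label offset, the dictionary-based render requests, and the function name are all assumptions for illustration.

```python
def attach_identification(target_bbox, identification, mode="visual"):
    """Associate identification data with the target in the chosen modality.

    Visual mode places a text label just above the target's bounding box
    (x, y, w, h); auditory mode returns a text-to-speech request instead."""
    if mode == "visual":
        x, y, w, h = target_bbox
        return {"text": identification, "pos": (x, max(0, y - 20))}
    if mode == "audio":
        return {"speak": identification}
    raise ValueError(f"unknown mode: {mode}")
```

Returning a render/playback request rather than drawing directly keeps the modality decision separate from the output hardware, matching the point that the two perception manners may differ.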
Through the embodiment of the disclosure, the multimedia data is analyzed and processed to determine the target data, identification data is added to the target data, and based on the identification data the target data the user wants to follow can be clearly distinguished within the multimedia data. Compared with the manual post-production marking approach adopted in the related art, the multimedia output device is more intelligent, the production cost is low, the user can follow the object of interest in time, and the user experience is improved.
The method shown in fig. 2 is further described with reference to fig. 3-4 in conjunction with specific embodiments.
Fig. 3 schematically shows a flow chart for processing multimedia data to determine target data according to an embodiment of the present disclosure.
As shown in fig. 3, processing multimedia data and determining target data includes operations S221 to S222.
In operation S221, first input information corresponding to the operation is acquired.
According to the embodiment of the disclosure, the first input information corresponding to the operation may be a voice, a text, or a picture input by the user, or a screenshot corresponding to a screenshot operation, and the like. The present disclosure does not limit the kind of the first input information or the manner in which it is input.
In operation S222, the first input information is matched with the multimedia data to determine target data.
According to the embodiment of the disclosure, taking a picture as the first input information, the input picture may be matched with the content in the multimedia data so as to determine the target data, which may likewise be a picture. Or, taking a piece of speech as the first input information, the input speech may be matched with the content in the multimedia data so as to determine the target data, which may be speech. Specifically, for example, the user inputs a piece of speech by Andy Lau (Liu Dehua); that speech is matched with the multimedia data (e.g., a piece of audio), and the part of the multimedia data belonging to Andy Lau's speech is determined to be the target data.
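One plausible way to realize this matching is feature-vector similarity, sketched below. The cosine measure, the 0.9 threshold, and the segment structure are assumptions; a real system would extract the features with a speech or image recognition model.

```python
def cosine(a, b):
    """Cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)


def match_first_input(query_features, segments, threshold=0.9):
    """Return the segments of the multimedia data whose feature vector is
    close enough to the features extracted from the first input information."""
    return [s for s in segments if cosine(query_features, s["features"]) >= threshold]
```

For the speech example above, `query_features` would be an embedding of the uploaded voice clip and each segment an embedded slice of the received audio.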
Through the embodiment of the disclosure, when a user is interested in certain target data, the first input information can be input, the multimedia output device can match the first input information with the multimedia data, so that the user can pay attention to the object which the user wants to know in time, and the user experience is improved.
Fig. 4 schematically shows a flow chart for processing multimedia data to determine target data according to another embodiment of the present disclosure.
As shown in fig. 4, processing multimedia data and determining target data includes operations S223 to S225.
In operation S223, second input information corresponding to the operation is acquired.
In operation S224, extension information having an association relationship with the second input information is acquired.
According to the embodiment of the disclosure, the type of the second input information is not limited, for example, the second input information may be a voice, a text, a picture, or a screenshot corresponding to a screenshot operation, and the like. According to the embodiment of the present disclosure, the manner of obtaining the extension information having the association relation with the second input information is not limited, and for example, the extension information may be obtained from a resource package stored in the multimedia output apparatus, or may be obtained from a network/server. The specific acquisition mode is not limited herein.
The second input information and the extension information are matched with the multimedia data to determine target data in operation S225.
According to an embodiment of the present disclosure, for example, if the second input information is the player Messi, extension information associated with Messi, such as a picture of Messi, his jersey number, running characteristics, height, etc., may be obtained locally or from a server. The name, picture, and jersey-number information of Messi are then matched with the multimedia data so as to match the images of Messi contained in the multimedia data.
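The expansion-then-match flow can be sketched as follows, under the assumption that extension information lives in a small local resource package and that frame objects expose recognized cues (e.g. a jersey number read from a shirt). All names and the sample data are hypothetical.

```python
EXTENSION_DB = {  # hypothetical local resource package of extension information
    "Messi": {"jersey_number": "10", "aliases": ["Leo Messi"]},
}


def match_with_extension(second_input, frame_objects, db=EXTENSION_DB):
    """Combine the second input with its associated extension information,
    then match every cue against the recognized cues of each frame object."""
    ext = db.get(second_input, {})
    cues = {second_input, *ext.get("aliases", [])}
    if "jersey_number" in ext:
        cues.add("number " + ext["jersey_number"])
    return [o for o in frame_objects if cues & set(o["cues"])]
```

The point of the extension step is visible here: even if face matching fails, the jersey-number cue derived from the extension information can still locate the target, which is why expansion raises the probability of successful labeling.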
According to the embodiment of the disclosure, the input information is expanded to obtain extension information having an association relationship with it, and the target data is determined according to both the input information and the extension information. This broadens the labeling scenarios and makes it possible to determine the target data effectively, thereby improving the probability of successful labeling.
According to the embodiment of the present disclosure, processing the multimedia data and determining the target data, obtaining identification data corresponding to the target data, and processing the identification data may include performing the following for each frame of image in the multimedia data: processing the frame of image; if the frame of image includes the target data, obtaining identification data corresponding to the target data, and processing the identification data so that the identification data can be perceived in association with the target data when the target data is perceived.
According to the embodiment of the present disclosure, so that the identification data can be perceived in association with the target data, after the multimedia data is acquired, the operations of processing a frame image, analyzing whether the frame image includes the target data, acquiring the identification data of the target data for each frame image that includes it, and processing the identification data may be performed separately for each frame image; that is, the foregoing operations may be performed once per frame image.
For example, while the multimedia output apparatus acquires the multimedia data through the multimedia interface, for each frame image in the multimedia data, the first input information is matched with each frame image to determine whether the frame image contains the target data, or the second input information and the extension information are matched with each frame image to determine whether the frame image contains the target data.
Specifically, taking the multimedia output device being a notebook computer and the multimedia data being a live video stream as an example, the target data may be a target star in the live video stream. While acquiring the live video stream through the multimedia interface, each time the notebook computer acquires a frame of image it may analyze whether the image includes the target star; if the image includes the target star, it acquires a picture avatar of the target star for that image and processes the picture avatar, so that a user may view the target star and the picture avatar at the same time.
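The per-frame flow just described can be sketched as a simple loop. This is a hypothetical sketch: `detect_target`, `fetch_identification`, and `attach` are caller-supplied stand-ins for the real detection, lookup, and rendering steps, and are not names from the disclosure.

```python
# Hypothetical sketch of the per-frame flow: each frame is processed
# independently; when a frame contains the target data, identification
# data is fetched and attached so both are output together.

def process_stream(frames, detect_target, fetch_identification, attach):
    out = []
    for frame in frames:
        target = detect_target(frame)      # analyze whether the frame
        if target is not None:             # includes the target data
            ident = fetch_identification(target)
            frame = attach(frame, target, ident)
        out.append(frame)
    return out

# Toy usage: frames are dicts; the "star" key marks the target.
frames = [{"id": 0}, {"id": 1, "star": "X"}, {"id": 2, "star": "X"}]
labeled = process_stream(
    frames,
    detect_target=lambda f: f.get("star"),
    fetch_identification=lambda star: f"avatar-of-{star}",
    attach=lambda f, t, ident: {**f, "label": ident},
)
```

Frames without the target pass through unchanged, which matches the per-frame scheme above: every frame is analyzed, but identification data is attached only where the target is found.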
According to the embodiment of the present disclosure, when processing the identification data, the identification data can be manually associated with the target data, or the multimedia output device can automatically associate the identification data with the target data; furthermore, the user can also manually switch the labeling mode.
According to the embodiment of the present disclosure, when processing the identification data, in the case where a plurality of target data are contained, the identification data of each target data may be acquired respectively, and each piece of identification data may be associated with its corresponding target data, so that when each target data is perceived, its corresponding identification data can be perceived in association with it.
For example, in a basketball game, players A, B, and C may need to be labeled separately. Once players A, B, and C are determined from the multimedia data, their names may be obtained respectively: the name of player A moves with player A on the basketball court, the name of player B moves with player B, and the name of player C moves with player C. The user thus sees the corresponding names of A, B, and C while watching the players, can pay attention in time to the objects he wants to know about, and the user experience is improved.
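The multi-target labeling in this example can be sketched as a per-frame mapping from each target to its own label placement. The sketch below is illustrative; `place_labels` and the position dictionaries are hypothetical stand-ins for real per-target detection results.

```python
# Hypothetical sketch of multi-target labeling: each detected target in
# a frame gets its own identification data, placed at that target's
# position, so every label moves with its own target across frames.

def place_labels(target_positions, names):
    """target_positions: {target_id: (x, y)} for one frame;
    names: {target_id: identification data}.
    Returns one label placement per target in the frame."""
    return {tid: {"text": names[tid], "at": pos}
            for tid, pos in target_positions.items()}

names = {"A": "Player A", "B": "Player B", "C": "Player C"}
frame_k = {"A": (10, 20), "B": (40, 25), "C": (70, 30)}    # positions, frame k
frame_k1 = {"A": (12, 21), "B": (38, 25), "C": (71, 32)}   # positions, frame k+1

labels_k = place_labels(frame_k, names)
labels_k1 = place_labels(frame_k1, names)  # each label follows its target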
According to an embodiment of the present disclosure, processing the multimedia data and determining the target data, obtaining identification data corresponding to the target data, and processing the identification data may include: performing, for one frame of image in the multimedia data, processing the frame of image; if the frame of image includes the target data, obtaining identification data corresponding to the target data, and processing the identification data so that the target data and the identification data are displayed in association. Then performing, for the next frame image of the one frame image: processing the next frame image; if the matching degree between the next frame image and the one frame image satisfies a condition, determining the movement parameters of the target object between the two frames, and determining the display of the identification data in the next frame image according to the movement parameters, so that the target data and the identification data are still displayed in association.
According to the embodiment of the present disclosure, when one frame of image in the multimedia data is processed, if the frame includes the target data, identification data corresponding to the target data is obtained and processed so that the target data and the identification data are displayed in association. The next frame image can then be processed on the basis of the previous frame image; in other words, the display state of each part of the display unit at the next moment can be derived from its display state at the previous moment.
For example, after one frame image including the target object has been analyzed, if the scene of the next frame image is unchanged relative to that frame, the two frame images can be considered to match; the target object then moves within the display unit, and the identifier may move along with it.
According to the embodiment of the present disclosure, the display position of the identification data in the next frame image may be adjusted according to the movement parameters; for example, the difference between the two frame images may be determined by detecting the movement parameters (including but not limited to acceleration, motion direction, etc.) of the target object, thereby adjusting the display state of the identification data on the display unit. Specifically, during a football match, while Messi dribbles toward the goal, his moving direction, acceleration, and other parameters can be detected, his position in the next frame image can be determined, and the identification data can be displayed at the corresponding position, so that Messi and the identification data are still displayed in association.
Through the embodiment of the disclosure, the processing efficiency of the device can be improved, and the calculation amount of the device can be saved.
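The frame-to-frame shortcut described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the frame-signature similarity function, the threshold, and the constant-velocity prediction are all hypothetical simplifications of the disclosure's matching-degree condition and movement parameters.

```python
# Hypothetical sketch of the frame-to-frame shortcut: when consecutive
# frames match well enough, the label position is advanced by the
# target's movement parameters instead of re-running full detection.

def predict_position(prev_pos, velocity, dt=1.0):
    """Advance the target's position by its motion parameters
    (constant velocity here; acceleration could be added)."""
    return (prev_pos[0] + velocity[0] * dt,
            prev_pos[1] + velocity[1] * dt)

def update_label(prev_sig, next_sig, prev_pos, velocity,
                 similarity, threshold=0.8):
    """If the matching degree between the two frames satisfies the
    condition, move the label by the movement parameters; otherwise
    return None so the caller falls back to full detection."""
    if similarity(prev_sig, next_sig) >= threshold:
        return predict_position(prev_pos, velocity)
    return None

# Toy usage with a trivial equality-based similarity on frame signatures.
sim = lambda a, b: 1.0 if a == b else 0.0
moved = update_label("pitch", "pitch", (100, 50), (5, -2), sim)
missed = update_label("pitch", "crowd", (100, 50), (5, -2), sim)
```

The efficiency claim in the text corresponds to the first branch: when the frames match, only a cheap position update runs, and the expensive per-frame detection is skipped.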
Fig. 5 schematically shows a block diagram of a processing system according to an embodiment of the disclosure.
As shown in fig. 5, the processing system 400 includes a first acquisition module 410, a first processing module 420, a second acquisition module 430, and a second processing module 440.
The first obtaining module 410 is used for obtaining multimedia data, wherein the multimedia data is data transmitted to a multimedia output device through a multimedia interface.
The first processing module 420 is configured to process the multimedia data to determine target data. Wherein the target data is data that can be perceived when output by the multimedia output device.
The second obtaining module 430 is configured to obtain identification data corresponding to the target data.
The second processing module 440 is configured to process the identification data so that the identification data can be perceived in association with the target data when the target data is perceived.
Through the embodiment of the disclosure, the multimedia data is analyzed and processed to determine the target data, identification data is added to the target data, and, based on the identification data, the target data that the user wants to pay attention to can be clearly distinguished from the rest of the multimedia data. Compared with the manual post-production labeling approach adopted in the related art, the multimedia output device is more intelligent and has a lower production cost, and the user can pay attention in time to the objects he wants to know about, improving the user experience.
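The module structure of fig. 5 can be sketched as a thin pipeline class. The module names below come from the disclosure; the callables they wrap, and the class itself, are illustrative stand-ins, not part of the patent.

```python
# Hypothetical sketch mirroring the four modules of fig. 5.

class ProcessingSystem:
    def __init__(self, acquire, determine_target, fetch_ident, attach):
        self.acquire = acquire                    # first acquisition module 410
        self.determine_target = determine_target  # first processing module 420
        self.fetch_ident = fetch_ident            # second acquisition module 430
        self.attach = attach                      # second processing module 440

    def run(self):
        data = self.acquire()                 # multimedia data via the interface
        target = self.determine_target(data)  # determine the target data
        if target is None:
            return data                       # nothing to label
        ident = self.fetch_ident(target)      # identification data
        return self.attach(data, target, ident)

# Toy usage with string data standing in for real multimedia frames.
system = ProcessingSystem(
    acquire=lambda: "frame with star",
    determine_target=lambda d: "star" if "star" in d else None,
    fetch_ident=lambda t: f"label:{t}",
    attach=lambda d, t, i: f"{d} [{i}]",
)
result = system.run()
```

Splitting acquisition, detection, lookup, and attachment into separate callables matches the later statement that the modules may be combined, split, or implemented in hardware independently of one another.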
Fig. 6 schematically shows a block diagram of a first processing module according to an embodiment of the disclosure.
As shown in fig. 6, the first processing module 420 includes a first obtaining unit 421 and a first matching unit 422.
The first obtaining unit 421 is configured to obtain first input information corresponding to an operation.
The first matching unit 422 is configured to match the first input information with the multimedia data to determine the target data.
Through the embodiment of the disclosure, when a user is interested in certain target data, the user can input the first input information, and the multimedia output device matches the first input information with the multimedia data, so that the user can pay attention in time to the object he wants to know about, improving the user experience.
Fig. 7 schematically illustrates a block diagram of a first processing module according to another embodiment of the present disclosure.
As shown in fig. 7, the first processing module 420 includes a second obtaining unit 423, a third obtaining unit 424, and a second matching unit 425.
The second obtaining unit 423 is configured to obtain second input information corresponding to the operation.
The third obtaining unit 424 is configured to obtain the extension information having an association relationship with the second input information.
The second matching unit 425 is configured to match the second input information and the extension information with the multimedia data to determine the target data.
According to the embodiment of the present disclosure, the input information is expanded to obtain extension information having an association relation with it, and the target data is determined according to both the input information and the extension information. This can expand the labeling scenarios and allow the target data to be determined effectively, thereby improving the probability of successful labeling.
According to an embodiment of the present disclosure, wherein: the first processing module 420 is further configured to perform, for each frame of image in the multimedia data, processing one frame of image; the second obtaining module 430 is further configured to obtain identification data corresponding to the target data if the frame of image includes the target data; the second processing module 440 is also for processing the identification data such that the identification data can be associated with a perception when the target data is perceived.
According to an embodiment of the present disclosure, wherein: the first processing module 420 is further configured to execute, for one frame of image in the multimedia data, processing one frame of image; the second obtaining module 430 is further configured to obtain identification data corresponding to the target data if the frame of image includes the target data; the second processing module 440 is further configured to process the identification data such that the target data is displayed in association with the identification data; the first processing module 420 is further configured to perform, for a next frame image of the one frame image, processing the next frame image, and if the matching degree between the next frame image and the one frame image satisfies a condition, determining a movement parameter of the target object in the next frame image and the one frame image; the second processing module 440 is further configured to determine the display of the identification data in the next frame of image according to the movement parameter, such that the target data and the identification data are still associated with each other.
Through the embodiment of the disclosure, the processing efficiency of the device can be improved, and the calculation amount of the device can be saved.
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in any other reasonable manner of hardware or firmware by integrating or packaging a circuit, or in any one of or a suitable combination of software, hardware, and firmware implementations. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the first obtaining module 410, the first processing module 420, the second obtaining module 430, and the second processing module 440 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first obtaining module 410, the first processing module 420, the second obtaining module 430, and the second processing module 440 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or any suitable combination of any of them. Alternatively, at least one of the first obtaining module 410, the first processing module 420, the second obtaining module 430 and the second processing module 440 may be at least partially implemented as a computer program module, which when executed, may perform a corresponding function.
The multimedia output device of the present disclosure includes a multimedia interface; a processor; a memory for storing one or more programs, wherein the one or more programs, when executed by the processor, cause the processor to implement the processing method of the present disclosure.
Fig. 8 schematically shows a block diagram of a multimedia output device adapted to implement the above described method according to an embodiment of the present disclosure. The multimedia output apparatus shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 8, the multimedia output apparatus 500 includes a processor 510, a memory 520, and a multimedia interface 530. The multimedia output apparatus 500 may perform a method according to an embodiment of the present disclosure.
In particular, processor 510 may include, for example, a general purpose microprocessor, an instruction set processor and/or related chip set and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), and/or the like. The processor 510 may also include on-board memory for caching purposes. Processor 510 may be a single processing unit or a plurality of processing units for performing different actions of a method flow according to embodiments of the disclosure.
Memory 520, for example, may be a non-volatile memory, specific examples including, but not limited to: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and so on.
The memory 520 may include a computer program 521, which computer program 521 may include code/computer-executable instructions that, when executed by the processor 510, cause the processor 510 to perform a method according to an embodiment of the disclosure, or any variation thereof.
The computer program 521 may be configured with, for example, computer program code comprising computer program modules. For example, in an example embodiment, code in the computer program 521 may include one or more program modules, including, for example, module 521A, module 521B, … …. It should be noted that the division and number of modules are not fixed; those skilled in the art may use suitable program modules or combinations of program modules according to actual situations, and when these program modules are executed by the processor 510, the processor 510 may execute the method according to the embodiment of the present disclosure or any variation thereof.
According to an embodiment of the present disclosure, at least one of the first obtaining module 410, the first processing module 420, the second obtaining module 430 and the second processing module 440 may be implemented as a computer program module described with reference to fig. 5, which, when executed by the processor 510, may implement the respective operations described above.
The present disclosure also provides a memory, which may be included in the device/apparatus/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The memory carries one or more programs that, when executed, implement methods in accordance with embodiments of the disclosure.
According to embodiments of the present disclosure, the memory may be a non-volatile memory, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, the memory may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that various combinations and/or sub-combinations of the features recited in the various embodiments and/or claims of the present disclosure can be made, even if such combinations or sub-combinations are not expressly recited in the present disclosure. In particular, various combinations and/or sub-combinations of the features recited in the various embodiments and/or claims of the present disclosure may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or sub-combinations fall within the scope of the present disclosure.
While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.

Claims (7)

1. A method of processing, comprising:
obtaining multimedia data, wherein the multimedia data is data transmitted to a multimedia output device through a multimedia interface;
the multimedia output device processes the multimedia data and determines target data in the process of outputting the data; wherein the target data is data that can be perceived when output by a multimedia output device;
obtaining identification data corresponding to the target data;
processing the identification data such that the identification data can be perceived in association with the target data when the target data is perceived;
wherein processing the multimedia data and determining the target data comprises:
acquiring first input information corresponding to the operation;
matching the first input information with the multimedia data to determine the target data; or, alternatively,
processing the multimedia data, wherein the determining the target data comprises:
acquiring second input information corresponding to the operation;
acquiring extended information having an association relation with the second input information;
and matching the second input information and the extension information with the multimedia data to determine the target data.
2. The method of claim 1, wherein the multimedia data is processed, target data is determined; obtaining identification data corresponding to the target data; processing the identification data includes:
respectively executing the following steps for each frame of image in the multimedia data: processing a frame of image; if the frame of image includes target data, identification data corresponding to the target data is obtained, and the identification data is processed so that the identification data can be perceived in association with the target data when the target data is perceived.
3. The method of claim 1, wherein the multimedia data is processed, target data is determined; obtaining identification data corresponding to the target data; processing the identification data includes:
performing, for a frame of image in the multimedia data: processing a frame of image; if the frame of image comprises target data, acquiring identification data corresponding to the target data, and processing the identification data to enable the target data to be displayed in association with the identification data;
performing, for a next frame image of the one frame image: and processing the next frame image, if the matching degree of the next frame image and the one frame image meets the condition, determining the movement parameters of the target object in the next frame image and the one frame image, and determining the display of the identification data in the next frame image according to the movement parameters so that the target data and the identification data are still associated to be displayed.
4. The method of claim 1, wherein:
the perceived mode of the target data is the same as or different from the perceived mode of the identification data corresponding to the target data.
5. A processing system, comprising:
the first acquisition module is used for acquiring multimedia data, wherein the multimedia data is data transmitted to a multimedia output device through a multimedia interface;
the first processing module is used for processing the multimedia data and determining target data in the process of outputting the data by the multimedia output device; wherein the target data is data that can be perceived when output by a multimedia output device;
the second acquisition module is used for acquiring identification data corresponding to the target data;
a second processing module for processing the identification data such that the identification data can be perceived in association with the target data when the target data is perceived;
wherein the first processing module comprises:
the first acquisition unit is used for acquiring first input information corresponding to the operation;
the first matching unit is used for matching the first input information with the multimedia data so as to determine the target data;
the second acquisition unit is used for acquiring second input information corresponding to the operation;
a third acquisition unit configured to acquire extended information having an association relationship with the second input information;
and the second matching unit is used for matching the second input information and the extension information with the multimedia data so as to determine the target data.
6. A multimedia output apparatus comprising:
a multimedia interface;
a processor;
a memory for storing one or more programs, wherein the one or more programs, when executed by the processor, cause the processor to implement the processing method of any of claims 1 to 4.
7. A memory having one or more programs stored thereon, which when executed by a processor, cause the processor to implement the method of any of claims 1 to 4.
CN201811165319.4A 2018-09-30 2018-09-30 Processing method and system, multimedia output device and memory Active CN109391849B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811165319.4A CN109391849B (en) 2018-09-30 2018-09-30 Processing method and system, multimedia output device and memory

Publications (2)

Publication Number Publication Date
CN109391849A CN109391849A (en) 2019-02-26
CN109391849B true CN109391849B (en) 2020-11-20

Family

ID=65419307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811165319.4A Active CN109391849B (en) 2018-09-30 2018-09-30 Processing method and system, multimedia output device and memory

Country Status (1)

Country Link
CN (1) CN109391849B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101834999A (en) * 2009-02-04 2010-09-15 索尼公司 Video process apparatus, method for processing video frequency and program
CN105191282A (en) * 2013-03-15 2015-12-23 高通股份有限公司 Methods and apparatus for augmented reality target detection
CN106303726A (en) * 2016-08-30 2017-01-04 北京奇艺世纪科技有限公司 The adding method of a kind of video tab and device
CN107341139A (en) * 2017-06-30 2017-11-10 北京金山安全软件有限公司 Multimedia processing method and device, electronic equipment and storage medium
CN107820116A (en) * 2017-11-14 2018-03-20 优酷网络技术(北京)有限公司 Video broadcasting method and device
CN107817897A (en) * 2017-10-30 2018-03-20 努比亚技术有限公司 A kind of information intelligent display methods and mobile terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016095073A1 (en) * 2014-12-14 2016-06-23 深圳市大疆创新科技有限公司 Video processing method and device, and imaging system



Similar Documents

Publication Publication Date Title
US9875598B2 (en) System and method for augmenting content
CN105187866B (en) Advertisement placement method and device
CN110944123A (en) Intelligent guide method for sports events
CN109640129B (en) Video recommendation method and device, client device, server and storage medium
JP2021525031A (en) Video processing for embedded information card locating and content extraction
KR102669827B1 (en) Natural language navigation and assisted viewing of indexed audio video streams, notably sports contests
CN103327407B (en) Audio-visual content is set to watch level method for distinguishing
CN108337573A (en) A kind of implementation method that race explains in real time and medium
WO2015041915A1 (en) Channel program recommendation on a display device
US10972800B2 (en) Apparatus and associated methods
JP2010021632A (en) Content information reproducing apparatus, content information reproducing system, content information reproducing method, content information reproducing program, recording medium therefor and information processing apparatus
US11538243B2 (en) Video playback method, medium, glasses and program product
CN110856039A (en) Video processing method and device and storage medium
CN112287771A (en) Method, apparatus, server and medium for detecting video events
US8768945B2 (en) System and method of enabling identification of a right event sound corresponding to an impact related event
CN108881938B (en) Live broadcast video intelligent cutting method and device
US9930412B2 (en) Network set-top box and its operating method
US6678641B2 (en) System and method for searching selected content using sensory data
US20180210906A1 (en) Method, apparatus and system for indexing content based on time information
CN108960130B (en) Intelligent video file processing method and device
CN109391849B (en) Processing method and system, multimedia output device and memory
CN106936830B (en) Multimedia data playing method and device
US10596452B2 (en) Toy interactive method and device
KR101342404B1 (en) System and method for controlling the image of multivision using a voice recognition
US10771846B2 (en) Electronic apparatus for playing substitutional advertisement and method for controlling method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant