CN115474083B - Multi-channel audio and video synchronous live broadcast method and system - Google Patents
Multi-channel audio and video synchronous live broadcast method and system
- Publication number
- CN115474083B (application CN202211363605.8A)
- Authority
- CN
- China
- Prior art keywords
- audio
- video
- path
- slice
- files
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
- H04N21/2187—Live feed
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Databases & Information Systems (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The application relates to a multi-channel audio and video synchronous live broadcast method. The server collects multiple channels of audio and video, performs format conversion on each channel to generate the slice files and index file corresponding to that channel, extracts the audio timestamp and serial number of each slice file, and writes them into the index file. The terminal acquires the index file corresponding to each channel from the server and, according to the audio timestamps in the index files, selects one slice file from each channel as its initial slice. The fastest (that is, the most advanced) of the audio timestamps of the initial slices is selected as a reference timestamp, and the slice files of each channel are calibrated against the reference timestamp and then played. The method and system solve the problem that multiple channels of audio and video fall out of synchronization because the video streams differ in the speed at which ts files are generated, the length of video contained in each ts file, and the like, and allow the user to independently edit the picture of each channel while the channels are broadcast live in synchronization.
    Description
Technical Field
      The application relates to the technical field of network live broadcast, in particular to a multi-channel audio and video synchronous live broadcast method and system.
    Background
      With the rapid development of the live broadcast industry, users increasingly demand new ways of watching audio and video, and novel live broadcast modes such as multi-screen synchronous interaction and multi-angle synchronous live broadcast are gradually being adopted in everyday life.
      In the related art, the channels differ in parameters such as frame rate, resolution, and bit rate, so the speed at which slice files are generated and the length of video contained in each slice also differ. For example, when the current mainstream HLS protocol is used to encapsulate an RTMP video stream into ts slice files, differences in the frame rate, resolution, and bit rate of the audio/video streams lead to differences in the speed at which ts files are generated and in the length of video each ts file contains, which in turn causes the channels to start playing on unsynchronized pictures and to play at inconsistent speeds.
      Therefore, multi-channel audio and video cannot be calibrated in the traditional way. Even if the initial slices can be calibrated, the slices differ in generation speed and length, so the channels drift further out of synchronization as playback proceeds; that is, the frame timestamps of the channels are synchronized but the pictures are not.
      At present, no effective solution has been proposed for the problem in the related art that multi-channel audio and video cannot be played synchronously.
    Disclosure of Invention
      The embodiment of the application provides a method, a system, computer equipment and a computer readable storage medium for synchronously broadcasting multi-channel audio and video, so as to at least solve the problem that the multi-channel audio and video cannot be synchronously played in the related technology.
      In a first aspect, an embodiment of the present application provides a method for synchronous live broadcasting of multiple channels of audio and video, where the method includes:
      the server side collects the multi-path audio and video,
      respectively carrying out format conversion on each path of audio and video to generate a slice file and an index file corresponding to each path of audio and video, extracting an audio time stamp and a serial number of each slice file and writing the audio time stamp and the serial number into the index file;
      the terminal sends a playing request, acquires the index files corresponding to each path of audio and video from the server,
      selecting a slice file in each path of audio and video as an initial slice according to the audio time stamp in the index file, wherein the difference value of the audio time stamp is the smallest between the initial slices selected in each path of audio and video,
      and selecting one of the audio time stamps of the initial slices of each path as a reference time stamp, downloading the slice files of the audio and video of each path in the index file in real time, calibrating the slice files of the audio and video of each path according to the reference time stamp, and then playing.
      In some embodiments, the playing after calibrating the slice file of each audio and video according to the reference timestamp includes:
      acquiring a reference channel audio/video corresponding to the reference timestamp;
      calculating, for each of the other channels, the difference between the audio timestamp of its initial slice file and the reference timestamp, to obtain the calibration time difference of that channel;
      and calibrating the initial slices of other paths of audios and videos according to the calibration time difference, playing the initial slices, and then sequentially playing the subsequent slice files of the paths of audios and videos according to the slice serial numbers.
      In some embodiments, after calculating the difference between the audio timestamp of the initial slice file of each of the other channels and the reference timestamp, the method further includes:
      acquiring the loading time of each channel on the terminal, and calculating the difference between the loading time of each of the other channels and the loading time of the reference channel;
      obtaining the calibration time difference of each of the other channels based on its loading time difference and its difference from the reference timestamp;
      and calibrating the initial slices of other paths of audios and videos according to the calibration time difference, playing the initial slices, and then sequentially playing the subsequent slice files of the paths of audios and videos according to the slice serial numbers.
      In some embodiments, calibrating the start slice of each of the other audio/video channels according to the calibration time difference includes:
      and fast forwarding the starting slices of the other paths of audios and videos to the time progress which is the same as the reference time stamp according to the calibration time difference.
      In some embodiments, after sequentially playing the subsequent slice files of the audio and video, the method further includes:
      acquiring the accumulated playing time of each path of video in real time;
      and detecting, according to the accumulated playing time, whether the multiple channels are playing synchronously; if not, determining the audio timestamp of the initial slice of the fastest-playing channel and using it as the reference timestamp to recalibrate the pictures of the other channels.
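      The drift check described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how "not synchronous" is measured, so the sketch assumes that each channel's current position is its calibrated start timestamp plus its accumulated playing time, and that a hypothetical tolerance (`tolerance_ms`) decides when recalibration is needed. The function and parameter names are invented for the example.

```python
def check_drift(start_ts_ms, played_ms, tolerance_ms=100):
    """Detect drift between channels from their accumulated playing time.

    start_ts_ms: channel -> audio timestamp (ms) at which playback started
    played_ms:   channel -> accumulated playing time (ms)
    tolerance_ms is a hypothetical threshold, not a value from the source.
    """
    # Assumed model: current position = start timestamp + time played so far.
    positions = {ch: start_ts_ms[ch] + played_ms[ch] for ch in start_ts_ms}
    # The fastest channel is the one that is furthest ahead.
    fastest = max(positions, key=positions.get)
    # How far each channel lags behind the fastest one.
    lag = {ch: positions[fastest] - pos for ch, pos in positions.items()}
    in_sync = max(lag.values()) <= tolerance_ms
    return in_sync, fastest, lag


if __name__ == "__main__":
    starts = {"ch1": 1600, "ch2": 1600}
    print(check_drift(starts, {"ch1": 10000, "ch2": 9700}))
```

      When `in_sync` is false, the lag values indicate how far each slower channel would need to be advanced to catch up with the fastest one.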
      In some embodiments, in the process of downloading the slice file of each audio and video in the index file in real time, and calibrating the slice file of each audio and video according to the reference timestamp and then playing the slice file, the method further includes:
      the terminal receives a user-defined operation instruction,
      according to the self-defined operation instruction, respectively editing and processing the pictures of the audio and video of each channel on a front-end display interface,
      wherein, the self-defining operation instruction comprises: a picture scaling instruction, a picture moving instruction, a picture floating instruction, and a picture deleting instruction.
      In some embodiments, the multiple paths of audio and video are RTMP stream files, and the server converts the RTMP stream files of each path of audio and video into HLS files respectively to obtain multiple sets of slice files and an index file.
      In a second aspect, an embodiment of the present application provides a multi-channel audio and video synchronous live broadcast system, where the system includes a server and a terminal, wherein:
      the server is used for acquiring multiple paths of audio and video, respectively performing format conversion on each path of audio and video, generating a slice file and an index file corresponding to each path of audio and video, extracting an audio time stamp and a serial number of each slice file and writing the audio time stamp and the serial number into the index file;
      the terminal is used for sending a playing request, acquiring the index file corresponding to each path of audio and video from the server, respectively selecting one slice file in each path of audio and video as an initial slice according to the audio time stamp in the index file, wherein the difference value of the audio time stamp is the smallest between the initial slices selected by each path of audio and video,
      and selecting one of the audio time stamps of the initial slices of each path as a reference time stamp, downloading the slice files of the audio and video of each path in the index file in real time, calibrating the slice files of the audio and video of each path according to the reference time stamp, and then playing.
      In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor, when executing the computer program, implements the method according to the first aspect.
      In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method according to the first aspect.
      Compared with the related technology, the multi-channel audio and video synchronous live broadcast method provided by the embodiment of the application at least has the following beneficial effects:
      the method calibrates on the basis of audio timestamps, so that, in scenarios where multi-channel audio and video are transmitted as sliced stream files, a good synchronous live broadcast effect can be achieved even for channels whose video parameters (bit rate, frame rate, resolution, and the like) differ;
      in the traditional approach, because the slices of multiple video streams cannot be accurately calibrated, the channels are merged into a single video stream before being sent to the terminal; in this embodiment, by contrast, the user receives multiple independent video streams, so that while experiencing synchronous playback the user can still independently edit the picture of each channel as required, and the original image quality is preserved.
    Drawings
      The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
      Fig. 1 is a schematic diagram of an application environment of a multi-channel audio and video synchronous live broadcast method according to an embodiment of the present application;
      Fig. 2 is a flowchart of a multi-channel audio and video synchronous live broadcast method according to an embodiment of the present application;
      Fig. 3 is a flowchart of playing after calibrating the slice files of each channel according to the reference timestamp in an embodiment of the present application;
      Fig. 4 is a block diagram of a multi-channel audio and video synchronous live broadcast system according to an embodiment of the present application;
      Fig. 5 is a schematic diagram of the internal structure of an electronic device according to an embodiment of the present application.
    Detailed Description
      In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
      It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
      Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly and implicitly understood by one of ordinary skill in the art that the embodiments described herein may be combined with other embodiments without conflict.
      Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in the present application, are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "And/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. References herein to the terms "first," "second," "third," and the like merely distinguish similar objects and do not denote a particular ordering of the objects.
      The multi-channel audio and video synchronous live broadcast method provided by the present application can be applied to the application environment shown in Fig. 1. Fig. 1 is a schematic diagram of an application environment of a multi-channel audio and video synchronous live broadcast method according to an embodiment of the present application. As shown in Fig. 1, the server 10 collects multiple channels of audio and video, packages each channel into its own group of slice files, and adds the audio timestamp and serial number of each slice file to the index. The terminal 11 then acquires each channel from the server by sending a play request, and calibrates the channels based on the audio timestamps in the index files to realize synchronous playback.
      It should be noted that the server 10 may be a physical server or a distributed cloud server, and the terminal 11 includes, but is not limited to, an internet terminal such as a smart phone, a tablet computer, or a notebook computer, and may implement multi-channel audio and video synchronous live broadcast through standalone video playing software or a web client.
      The application provides a method for multi-channel audio and video synchronous live broadcast, fig. 2 is a flow chart of the method for multi-channel audio and video synchronous live broadcast according to the embodiment of the application, and as shown in fig. 2, the flow includes the following steps:
      S201, the server collects multiple channels of audio and video;
      the server side collects multiple paths of audio and video through a wireless network, and the multiple paths of audio and video can be videos of the same object shot from different angles in the same time period, such as videos of the same anchor shot from different angles, videos of the same building or landscape shot from different angles, and the like.
      Further, the server may be a distributed file server, on which a Content Delivery Network (CDN) is built, and the CDN is composed of edge node server groups distributed in different areas.
      A CDN has a wide range of applications and supports content acceleration in many industries and scenarios, for example: small picture files, large file downloads, video and audio on demand, live streaming media, whole-site acceleration, and security acceleration. In this embodiment, the storage and management of the multiple channels of audio and video are realized on the basis of the CDN.
      S202, respectively carrying out format conversion on each path of audio and video, generating a slice file and an index file corresponding to each path of audio and video, extracting an audio time stamp and a serial number of each slice file, and writing the audio time stamp and the serial number into the index file;
      the multi-channel audio/video can be in various formats, and the file obtained by converting the multi-channel audio/video can also be a file based on various protocols, specifically:
      the audio/video file may be a file based on an RTMP (Real Time Messaging Protocol) or an audio/video file of another Protocol; further, the audio/video file can be converted into an audio/video file based on an HLS protocol or a DASH protocol, and the index file includes, but is not limited to, an m3u8 file, a txt file, and the like;
      it should be noted that the processing may differ in its details depending on the specific file protocol adopted; however, it should be understood that whatever manner is adopted to convert the original video into a plurality of slice stream files and index files, and whatever manner is used to associate each index file with those slice stream files, falls within the scope of the claims of the present application.
      In this embodiment, because the RTMP and HLS protocols are mainstream live broadcast protocols in the field, are easier to distribute over a CDN, and have better penetrability, the acquired audio/video file is preferably an RTMP stream file and the format conversion preferably converts it into HLS protocol files, as exemplified below:
      the RTMP audio/video stream is converted into HLS files at the server, yielding an m3u8 index file, a plurality of ts media slice files, and an encryption key file, and the audio timestamp of each ts media slice is then written into the index file;
      specifically, the number of ts media slices generated for each channel, the speed at which each slice is generated, and the video duration contained in each slice are determined by the specific parameters of the audio/video file (such as frame rate and bit rate), and each slice has its own serial number;
      the m3u8 file is an M3U file in UTF-8 encoding and is essentially a plain-text index file. When the terminal opens it, the playing software does not play the file itself, but uses the index it contains to locate the network addresses of the corresponding audio/video files and plays them online;
      further, in this embodiment an additional layer of tag information is added to the index file: for each channel, the audio timestamp and serial number of each slice file obtained by encapsulation are written into the index file for subsequent calibration.
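      As an illustration of an index file that carries a per-slice audio timestamp, the following minimal sketch renders and parses an m3u8 playlist. The standard HLS tags (`#EXTM3U`, `#EXT-X-VERSION`, `#EXT-X-TARGETDURATION`, `#EXT-X-MEDIA-SEQUENCE`, `#EXTINF`) are real; the tag `#EXT-X-AUDIO-TS`, the `Slice` class, and the file names are hypothetical, since the source does not say which tag or layout is used. The serial number is represented here through the media sequence, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    serial: int        # slice serial number
    uri: str           # ts file name
    duration: float    # slice length in seconds
    audio_ts_ms: int   # audio timestamp of the slice, in milliseconds


# Hypothetical custom tag; the source does not name the tag it uses.
AUDIO_TS_TAG = "#EXT-X-AUDIO-TS"


def render_index(slices, target_duration=2):
    """Render an m3u8 index (assumes at least one slice, consecutive serials)."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{slices[0].serial}",
    ]
    for s in slices:
        lines.append(f"{AUDIO_TS_TAG}:{s.audio_ts_ms}")
        lines.append(f"#EXTINF:{s.duration:.3f},")
        lines.append(s.uri)
    return "\n".join(lines) + "\n"


def parse_index(text):
    """Return (serial, uri, audio_ts_ms) triples from an index rendered above."""
    result, serial, audio_ts = [], 0, None
    for line in text.splitlines():
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            serial = int(line.split(":", 1)[1])
        elif line.startswith(AUDIO_TS_TAG + ":"):
            audio_ts = int(line.split(":", 1)[1])
        elif line and not line.startswith("#"):
            # A non-tag line is a slice URI; pair it with the pending timestamp.
            result.append((serial, line, audio_ts))
            serial += 1
            audio_ts = None
    return result


if __name__ == "__main__":
    demo = [Slice(7, "seg7.ts", 2.0, 14000), Slice(8, "seg8.ts", 2.0, 16040)]
    print(render_index(demo))
```

      Players that do not recognize the extra tag would normally skip it as an unknown line, which is one reason a tag-based layout is a plausible choice here.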
      S203, the terminal sends a playing request, acquires index files corresponding to each path of audio and video from the server, and selects one slice file in each path of audio and video as an initial slice according to audio time stamps in the index files, wherein the difference value of the audio time stamps among the initial slices selected by each path of audio and video is the minimum;
      It should be noted that each channel is split into multiple slice files. Assuming that each slice file lasts approximately 2 s, the slice files listed in the index file of each channel change dynamically as the live broadcast progresses, while the total number of slice files listed in the index remains unchanged.
      As shown in Table 1 below:
      | Audio/video channel | 1 | 2 | 3 |
      | Slice serial number | ① | ② | ⑤ |
      | Timestamp | 1 s | 3 s | 7 s |
Suppose three channels of audio and video take part in the synchronous live broadcast and the slice length of each channel is 2 s. If the initial slices selected for the channels are slice ① for channel 1, slice ② for channel 2, and slice ⑤ for channel 3, the timestamps of the three channels are 1 s, 3 s, and 7 s respectively.
      To synchronize the channels, the slices of channel 1 and channel 2 would have to be fast-forwarded to the 7th second. However, because of network limitations, channel 1 may not yet have downloaded the data of slice ⑤, so it cannot be fast-forwarded directly to the picture at the 7th second that corresponds to slice ⑤; if fast-forwarding is attempted anyway, abnormal situations such as picture freezing or reverse playback will result.
      In the embodiment of the present application, the initial slice of each channel is determined according to the audio timestamps, and the initial slices are selected so that the difference between their audio timestamps is the smallest. This ensures that the time difference between the initial slices of different channels does not exceed the length of one slice; accordingly, in the subsequent synchronization, the fast-forwarding of each channel is confined to a single slice, which effectively avoids the above problem.
      Further, consider what happens if the initial slice is instead selected by serial number. Because the specific parameters of each channel differ, slice files are generated at different speeds and with different lengths, and after the live broadcast has run for a long time these differences accumulate, so that slices with the same serial number may correspond to quite different real times in different channels. For example, the timestamp of slice 800 may be 1600 seconds in channel 1 but may have drifted to 1610 seconds in channel 3. In that case, if slice 800 is selected as the initial slice of all three channels for calibration, the fast-forwarding abnormality caused by data that has not yet been downloaded still inevitably occurs.
      Compared with the traditional approach, this embodiment obtains the initial slices by evaluating the differences between the audio timestamps of the channels, which avoids calibration failures caused by subsequent data not yet being downloaded during calibration, ensures the stability of the synchronization process, and improves the synchronization effect.
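      The selection rule described above (one slice per channel, with the smallest difference between audio timestamps) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it tries every combination by brute force, which is only reasonable because an index window lists a small, fixed number of slices. The function name, the input layout, and the tie-break that prefers the most recent combination are all assumptions.

```python
from itertools import product


def pick_start_slices(index_by_channel):
    """Pick one (serial, audio_ts) per channel with the smallest timestamp spread.

    index_by_channel: channel -> list of (serial, audio_ts_seconds) taken from
    that channel's index file. Ties are broken in favour of the latest
    timestamps (an assumption suited to live playback).
    """
    channels = sorted(index_by_channel)
    best_key, best_combo = None, None
    for combo in product(*(index_by_channel[c] for c in channels)):
        stamps = [ts for _, ts in combo]
        spread = max(stamps) - min(stamps)
        key = (spread, -max(stamps))
        if best_key is None or key < best_key:
            best_key, best_combo = key, combo
    return dict(zip(channels, best_combo))


if __name__ == "__main__":
    index = {
        "ch1": [(1, 1.0), (2, 3.0), (3, 5.0), (4, 7.0)],
        "ch2": [(1, 1.5), (2, 3.5), (3, 5.5), (4, 7.5)],
        "ch3": [(5, 7.0), (6, 9.0)],
    }
    print(pick_start_slices(index))
```

      Note that the chosen serial numbers differ between channels (4, 4, and 5 in the demo data), which matches the point made above that equal serial numbers do not imply equal timestamps.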
      S204, selecting one of the audio timestamps of the initial slices of the channels as a reference timestamp, downloading in real time the slice files of each channel listed in the index file, calibrating the slice files of each channel according to the reference timestamp, and then playing them.
      It should be noted that, because the channels differ in parameters such as frame rate, bit rate, and resolution, when a conventional timestamp is used as the synchronization reference, the channels drift further out of synchronization as playback proceeds even if the initial pictures are played in synchronization;
      in this embodiment, the audio timestamp is written into the index file, a suitable initial slice is dynamically selected from the current index file according to the progress of the live broadcast, and an audio-based synchronization reference is then determined.
      Through steps S201 to S204, in contrast to the multi-channel audio and video synchronous live broadcast methods of the related art, after acquiring the index files of the different channels the client can select the initial slices according to the audio timestamps they contain and then synchronize the channels by taking the fastest timestamp among the initial slices as the reference. A good synchronization effect can therefore still be obtained for audio/video streams with different video parameters.
      Compared with the prior art, in which multiple channels of audio and video are merged into a single channel before being sent to the terminal, the present method relies on the relation between the slices and the index to guarantee synchronous live broadcast while still letting the terminal receive each channel independently. The user can therefore independently adjust the size, on-screen position, picture-in-picture arrangement, and the like of each channel's picture to meet different viewing needs, and the video received by the terminal retains its original image quality.
      In some embodiments, fig. 3 is a flowchart of calibrating the slice files of each audio/video path according to the reference timestamp and then playing them in an embodiment of the present application. As shown in fig. 3, the flow includes the following steps:
      S301: record the audio/video path to which the start slice carrying the reference timestamp belongs as the reference path;
      S302: for each of the other audio/video paths, obtain the audio timestamp of its start slice file and take the difference between that timestamp and the reference timestamp as the calibration time difference of the path; it can be understood that the other paths are all paths except the reference path;
      S303: according to the calibration time differences, first calibrate and play the start slice of each path, and then play the subsequent slice files of each path in order of slice sequence number.
      Calibrating the start slice of each path amounts to fast-forwarding the one or more paths whose timestamps lag, by the timestamp difference, to the same playback position as the path that is furthest ahead.
      As shown in Table 2 below:
| Audio/video path | I | II | III |
|---|---|---|---|
| Slice sequence number | ① | ① | ① |
| Timestamp | 1 s | 1.2 s | 1.6 s |
The timestamps of the start slices selected in the three audio/video paths are 1 s, 1.2 s, and 1.6 s respectively. In this case, the pictures of the first and second paths are fast-forwarded to 1.6 s, which achieves multi-path synchronization.
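      The Table 2 calculation can be written out as a small sketch. The function name `calibration_offsets` and the use of integer milliseconds are assumptions made for illustration; the logic simply takes the largest start-slice timestamp as the reference and fast-forwards every other path by its difference from it.

```python
def calibration_offsets(start_ts_ms):
    """Return (reference_path, {path: fast-forward amount in ms}).

    start_ts_ms maps each path name to the audio timestamp of its start slice.
    """
    # The path with the largest (fastest) timestamp becomes the reference path.
    reference_path = max(start_ts_ms, key=start_ts_ms.get)
    reference_ts = start_ts_ms[reference_path]
    # Every path is fast-forwarded by its lag behind the reference; the reference itself by 0.
    offsets = {path: reference_ts - ts for path, ts in start_ts_ms.items()}
    return reference_path, offsets


# Values from Table 2, expressed in milliseconds.
ref_path, offsets = calibration_offsets({"I": 1000, "II": 1200, "III": 1600})
```

      For the Table 2 values, path III is the reference, and paths I and II are fast-forwarded by 600 ms and 400 ms so that all three sit at 1.6 s.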
      In some embodiments, network fluctuation may also cause the loading times of the audio/video paths to differ, and this difference in loading time can itself lead to unsynchronized playback. To further improve the stability of synchronous live broadcast, the process of calculating the calibration time difference of each path in this embodiment therefore further includes the following steps:
      acquire the loading time of each audio/video path on the terminal, and calculate the difference between the loading time of each other path and the loading time of the reference path;
      obtain the calibration time difference of each other path from the loading-time difference together with its difference from the reference timestamp;
      calibrate the start slice of each path according to the calibration time difference.
      Specifically, as shown in Table 3 below:
| Audio/video path | I | II | III |
|---|---|---|---|
| Audio timestamp | 2.1 s | 2.2 s | 2.3 s |
| Loading time | 0.2 s | 0.5 s | 0.3 s |
Table 3 shows that, because the loading times differ, the first path can only start playing its 2.1 s picture after 0.2 s, the second path can only start playing its 2.2 s content after 0.5 s, and the third path can only start playing its 2.3 s picture after 0.3 s. It follows that 0.5 s after loading begins, when the second path is just starting at 2.2 s, the first path has already reached 2.4 s and the third path has reached 2.5 s.
      Therefore, once both the audio timestamp difference and the loading time difference are taken into account, the third path is determined to be the reference path and the reference timestamp is 2.5 s. The calibration time differences of the first and second paths are 0.1 s and 0.3 s respectively, so the first path is fast-forwarded by 0.1 s and the second path by 0.3 s to reach 2.5 s on the third path, which synchronizes the three paths.
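      The Table 3 arithmetic can be reproduced with a short sketch. It assumes, as the worked example does, that each path starts playing at normal speed as soon as its own loading finishes; the function name `load_aware_offsets` and the integer-millisecond units are illustrative choices, not part of the original text.

```python
def load_aware_offsets(paths):
    """Compute calibration offsets that account for per-path loading time.

    paths maps a path name to (audio_ts_ms, load_ms).
    Returns (reference_path, reference_ts_ms, {path: fast-forward amount in ms}).
    """
    slowest_load = max(load for _, load in paths.values())
    # Playback position each path has reached when the slowest path finishes loading.
    position = {
        name: ts + (slowest_load - load) for name, (ts, load) in paths.items()
    }
    reference_path = max(position, key=position.get)
    reference_ts = position[reference_path]
    offsets = {name: reference_ts - pos for name, pos in position.items()}
    return reference_path, reference_ts, offsets


# Values from Table 3, expressed in milliseconds.
ref_path, ref_ts, offsets = load_aware_offsets(
    {"I": (2100, 200), "II": (2200, 500), "III": (2300, 300)}
)
```

      With the Table 3 values this yields path III as the reference at 2500 ms, and fast-forward amounts of 100 ms and 300 ms for paths I and II, matching the figures in the text.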
      In some embodiments, abnormal conditions affecting the multi-channel video streams can still disturb playback, so the live pictures may fall out of sync even after synchronous playing has been established.
      Therefore, in this embodiment the playing time of each audio/video path is recorded continuously during playback in order to monitor it. When one or more paths are detected to be out of sync with the others, the audio timestamp of the path whose picture is furthest ahead is determined and used as the reference timestamp to recalibrate the pictures of the other paths.
      The calibration itself is similar to steps S301 to S303 above and is not described again here.
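      The monitoring step can be sketched as a periodic check on the recorded playing positions. The text does not say how large a deviation counts as out of sync, so the `tolerance_ms` parameter and its default value are assumptions, as is the function name `recalibrate`.

```python
def recalibrate(play_pos_ms, tolerance_ms=40):
    """Return the fast-forward amount (ms) for each lagging path, or {} if in sync.

    play_pos_ms maps each path name to its current playing position in ms.
    tolerance_ms is an assumed threshold; the source does not specify one.
    """
    reference = max(play_pos_ms.values())  # path that is furthest ahead
    if reference - min(play_pos_ms.values()) <= tolerance_ms:
        return {}
    # Bring every lagging path up to the reference position.
    return {
        path: reference - pos
        for path, pos in play_pos_ms.items()
        if pos < reference
    }


in_sync = recalibrate({"I": 10000, "II": 10020, "III": 10010})
drifted = recalibrate({"I": 10000, "II": 10500, "III": 10480})
```

      A player could call such a check on a timer and apply the returned offsets in the same way as the start-slice calibration.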
      In some embodiments, because the user terminal in the present application receives multiple audio/video paths that remain independent of one another, the user can edit and adjust each path individually, which meets diversified user needs and improves the user experience. Specifically:
      the terminal receives a user-defined operation instruction and, according to it, edits the picture of each audio/video path on the front-end display interface, where user-defined operation instructions include, but are not limited to, a picture zooming instruction, a picture moving instruction, a picture floating instruction, and a picture deleting instruction.
      In some embodiments, the multiple audio/video paths may be RTMP streams or streams in another format. Further, they may be converted into the HLS format, or into DASH, a protocol similar to HLS.
      This embodiment also provides a multi-channel audio and video synchronous live broadcast system, which is used to implement the above embodiments and preferred embodiments; what has already been described is not repeated. As used below, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
      Fig. 4 is a block diagram of the structure of a multi-channel audio and video synchronous live broadcast system according to an embodiment of the present application. As shown in fig. 4, the system includes a server 40 and a terminal 41, wherein:
      the server 40 is configured to collect multiple audio/video paths, perform format conversion on each path, generate the slice files and index file corresponding to each path, and extract the audio timestamp and sequence number of each slice file and write them into the index file;
      the terminal 41 is configured to send a play request, obtain the index file corresponding to each path from the server 40, and select one slice file in each path as the start slice according to the audio timestamps in the index file, such that the difference between the audio timestamps of the start slices selected across the paths is the smallest;
      the terminal 41 is further configured to select one of the audio timestamps of the start slices as the reference timestamp, download the slice files of each path listed in the index file in real time, calibrate the slice files of each path against the reference timestamp, and then play them.
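      As an illustration of how the terminal might read the audio timestamps and sequence numbers that the server writes into an index file, the sketch below parses a small M3U8-style playlist (the claims name M3U8 as the index format). The text does not say how the audio timestamp is encoded, so the tag `#EXT-X-AUDIO-TS` is purely hypothetical and is not part of the HLS specification; `#EXTM3U`, `#EXT-X-TARGETDURATION`, `#EXT-X-MEDIA-SEQUENCE`, and `#EXTINF` are standard HLS tags. The file names and the function name `parse_index` are likewise made up for the example.

```python
def parse_index(text):
    """Return a list of (sequence_number, audio_ts_ms, uri) from a playlist string."""
    sequence = 0      # sequence number of the next slice
    audio_ts = None   # audio timestamp announced for the next slice, if any
    slices = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            sequence = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-AUDIO-TS:"):  # hypothetical custom tag
            audio_ts = int(line.split(":", 1)[1])
        elif line and not line.startswith("#"):
            # A non-comment line is the URI of a slice file.
            slices.append((sequence, audio_ts, line))
            sequence += 1
            audio_ts = None
    return slices


SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:7
#EXT-X-AUDIO-TS:28000
#EXTINF:4.0,
path1_7.ts
#EXT-X-AUDIO-TS:32000
#EXTINF:4.0,
path1_8.ts
"""
parsed = parse_index(SAMPLE)
```

      The parsed tuples give the terminal the per-slice sequence numbers and audio timestamps it needs to choose start slices across the paths.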
      In one embodiment, an electronic device is provided, which may be a server; fig. 5 is a schematic diagram of its internal structure according to an embodiment of the present application. The electronic device comprises a processor, a network interface, an internal memory, and a non-volatile memory connected by an internal bus, wherein the non-volatile memory stores an operating system, a computer program, and a database. The processor provides computing and control capability; the network interface connects to and communicates with an external terminal through a network; the internal memory provides an environment for running the operating system and the computer program; the computer program, when executed by the processor, implements the multi-channel audio and video synchronous live broadcast method; and the database stores data.
      Those skilled in the art will appreciate that the configuration shown in fig. 5 is a block diagram of only the part of the configuration related to the present application and does not limit the electronic device to which the present application is applied. A particular electronic device may include more or fewer components than those shown in the drawing, may combine certain components, or may arrange the components differently.
      It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
      It should be understood by those skilled in the art that the technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination is described; however, any combination of these features that involves no contradiction should be considered within the scope of this disclosure.
      The above embodiments express only several implementations of the present application. Their description is specific and detailed, but it should not be understood as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.
    Claims (9)
1. A multi-channel audio and video synchronous live broadcast method, characterized by comprising:
      a server collecting multiple audio/video paths;
      performing format conversion on each audio/video path, generating slice files and an index file corresponding to each path, and extracting the audio timestamp and sequence number of each slice file and writing them into the index file, wherein the multiple audio/video paths are RTMP (Real-Time Messaging Protocol) streams, the server converts the RTMP stream of each path into HLS (HTTP Live Streaming) files to obtain, for each path, a group of TS (transport stream) slice files and one index file, and the index file is an M3U8 file;
      a terminal sending a play request and acquiring the index file corresponding to each path from the server;
      selecting one slice file in each path as a start slice according to the audio timestamps in the index file, wherein the difference between the audio timestamps of the start slices selected across the paths is the smallest; and
      selecting one of the audio timestamps of the start slices as a reference timestamp, downloading the slice files of each path listed in the index file in real time, obtaining a calibration time difference for the slice files of each path according to the reference timestamp, synchronizing the slice files of each path according to the calibration time difference, and then playing them.
    2. The method according to claim 1, wherein calibrating the slice files of each audio/video path according to the reference timestamp and then playing them comprises:
      acquiring the reference audio/video path corresponding to the reference timestamp;
      obtaining the audio timestamp of the start slice file of each other path, and taking the difference between that audio timestamp and the reference timestamp as the calibration time difference of the path; and
      calibrating and playing the start slice of each other path according to the calibration time difference, and then playing the subsequent slice files of each path in order of slice sequence number.
    3. The method according to claim 2, wherein after obtaining the audio timestamp of the start slice file of each other path and its difference from the reference timestamp, the method further comprises:
      acquiring the loading time of each path on the terminal, and calculating the difference between the loading time of each other path and the loading time of the reference path;
      obtaining the calibration time difference of each other path based on the loading-time difference together with the difference from the reference timestamp; and
      calibrating and playing the start slice of each other path according to the calibration time difference, and then playing the subsequent slice files of each path in order of slice sequence number.
    4. The method according to claim 3, wherein calibrating the start slice of each other path according to the calibration time difference comprises:
      fast-forwarding the start slice of each other path, by the calibration time difference, to the same playback position as the reference timestamp.
    5. The method according to claim 2, wherein after playing the subsequent slice files of each path in order, the method further comprises:
      acquiring the accumulated playing time of each path in real time; and
      detecting, according to the accumulated playing time, whether the multiple paths are being played in sync and, if not, acquiring the playing time of the path whose picture is furthest ahead and using it as the reference timestamp to recalibrate the pictures of the other paths.
    6. The method according to claim 1, wherein while downloading the slice files of each path listed in the index file in real time, calibrating them according to the reference timestamp, and then playing them, the method further comprises:
      the terminal receiving a user-defined operation instruction; and
      editing the picture of each audio/video path on a front-end display interface according to the user-defined operation instruction,
      wherein the user-defined operation instruction comprises: a picture zooming instruction, a picture moving instruction, a picture floating instruction, and a picture deleting instruction.
    7. A multi-channel audio and video synchronous live broadcast system, characterized by comprising a server and a terminal, wherein:
      the server is configured to collect multiple audio/video paths, perform format conversion on each path, generate slice files and an index file corresponding to each path, and extract the audio timestamp and sequence number of each slice file and write them into the index file, wherein the multiple audio/video paths are RTMP (Real-Time Messaging Protocol) streams, the server converts the RTMP stream of each path into HLS (HTTP Live Streaming) files to obtain, for each path, a group of TS (transport stream) slice files and one index file, and the index file is an M3U8 file;
      the terminal is configured to send a play request, acquire the index file corresponding to each path from the server, and select one slice file in each path as a start slice according to the audio timestamps in the index file, wherein the difference between the audio timestamps of the start slices selected across the paths is the smallest; and
      the terminal is further configured to select one of the audio timestamps of the start slices as a reference timestamp, download the slice files of each path listed in the index file in real time, obtain a calibration time difference for the slice files of each path according to the reference timestamp, synchronize the slice files of each path according to the calibration time difference, and then play them.
    8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 6 when executing the computer program.
    9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
    Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202211363605.8A CN115474083B (en) | 2022-11-02 | 2022-11-02 | Multi-channel audio and video synchronous live broadcast method and system | 
Publications (2)
| Publication Number | Publication Date | 
|---|---|
| CN115474083A (en) | 2022-12-13 | 
| CN115474083B (en) | 2023-03-14 | 
Family
ID=84337032
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| CN202211363605.8A Active CN115474083B (en) | 2022-11-02 | 2022-11-02 | Multi-channel audio and video synchronous live broadcast method and system | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN115474083B (en) | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |