
WO2018171567A1 - Method, server and terminal for playing a media stream - Google Patents

Method, server and terminal for playing a media stream

Info

Publication number
WO2018171567A1
WO2018171567A1 PCT/CN2018/079520 CN2018079520W WO2018171567A1 WO 2018171567 A1 WO2018171567 A1 WO 2018171567A1 CN 2018079520 W CN2018079520 W CN 2018079520W WO 2018171567 A1 WO2018171567 A1 WO 2018171567A1
Authority
WO
WIPO (PCT)
Prior art keywords
video stream
terminal
server
stream
fragment
Prior art date
Application number
PCT/CN2018/079520
Other languages
English (en)
Chinese (zh)
Inventor
杨生飞
王赵淮
王伟
姜立科
曹阳
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018171567A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/64 Addressing
    • H04N21/6405 Multicasting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/61 Network physical structure; Signal processing
    • H04N21/6106 Network physical structure; Signal processing specially adapted to the downstream path of the transmission network
    • H04N21/6125 Network physical structure; Signal processing specially adapted to the downstream path of the transmission network involving transmission via Internet

Definitions

  • the present application relates to the field of communications, and in particular, to a method, a server, and a terminal for playing a media stream.
  • In an Internet Protocol Television (IPTV) system, a user can watch a television program using a terminal such as a television, a mobile phone, or a tablet.
  • When the terminal switches to another channel, that channel is referred to as the target channel, and the terminal can obtain the media stream of the target channel for playback in the following manner.
  • In this scenario, the video stream of each channel comes in two types, a high-rate video stream and a low-rate video stream. A Fast Channel Change (FCC) server caches the high-rate video stream and the low-rate video stream of each channel sent by the video source, so that when the terminal switches to the target channel, the FCC server can select a video stream as required and provide it to the terminal for playback.
  • The switching process is as follows: after receiving the play request that the terminal sends to switch to the target channel, the FCC server first sends the low-rate video stream of the target channel to the terminal, so that the terminal can quickly start playing the television program of the target channel. After sending the low-rate video stream for a period of time, in order to improve the picture quality at the terminal, the FCC server sends the high-rate video stream of the target channel to the terminal and notifies the terminal to join the multicast group of the target channel. The terminal plays the high-rate video stream sent by the FCC server, joins the multicast group of the target channel, and plays the audio stream and the video stream sent by the multicast group once it receives them.
  • In this process the FCC server sends only the video stream to the terminal and the audio stream is not involved, so the terminal may play no sound, or the sound it plays may be inconsistent with the video picture.
  • the embodiment of the present application provides a method, a server, and a terminal for playing the media stream.
  • the technical solution is as follows:
  • A first aspect provides a method for playing a media stream. The method comprises: a server receives an acquisition request sent by a terminal, where the acquisition request carries a fragment identifier of a video stream fragment in a video stream that has been sent to the terminal; the server determines, according to the fragment identifier, the start time of the video stream that has been sent to the terminal, and obtains an audio stream corresponding to the video stream according to the start time; and the server sends the audio stream to the terminal, so that the terminal plays the video stream and the audio stream.
  • After receiving the acquisition request, the server can determine, from the fragment identifier carried in the request, the audio stream corresponding to the video stream already sent to the terminal, and that audio stream has the same start time as the video stream. Therefore, when the terminal plays the video stream and the audio stream together, audio and video are synchronized, which solves the problem in the related art that the terminal plays no sound or plays sound inconsistent with the video picture.
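The server-side lookup described in the first aspect can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Fragment` class, the `audio_for_video` function, and the use of seconds for start times are all assumptions introduced here. It only shows the idea of mapping a fragment identifier to a start time and returning the audio that begins at that same start time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    fragment_id: int    # identifier shared by corresponding video and audio fragments
    start_time: float   # start time in seconds (the unit is an assumption)


def audio_for_video(fragment_id: int,
                    video: List[Fragment],
                    audio: List[Fragment]) -> List[Fragment]:
    """Return the audio fragments that begin at the start time of the
    video fragment named by fragment_id, in playback order."""
    video_by_id = {f.fragment_id: f for f in video}
    if fragment_id not in video_by_id:
        raise KeyError(f"unknown fragment identifier: {fragment_id}")
    start = video_by_id[fragment_id].start_time
    ordered = sorted(audio, key=lambda f: f.start_time)
    return [f for f in ordered if f.start_time >= start]


# Hypothetical cached streams: three fragments of two seconds each.
video = [Fragment(1, 0.0), Fragment(2, 2.0), Fragment(3, 4.0)]
audio = [Fragment(1, 0.0), Fragment(2, 2.0), Fragment(3, 4.0)]
selected = audio_for_video(2, video, audio)
print([f.fragment_id for f in selected])  # prints [2, 3]
```

Because the returned audio starts at the same start time as the identified video fragment, a terminal that plays both from their first fragment stays synchronized.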
  • In a possible implementation, the video stream sent by the server to the terminal includes description information of at least one audio stream, so that the terminal can select, from that description information, the description information of one audio stream that matches the capability of the terminal. The acquisition request received by the server then further includes the description information of that one audio stream. After receiving the acquisition request, the server determines, according to the fragment identifier in the request, the start time of the video stream that has been sent to the terminal, determines one audio stream according to the description information in the request, and obtains, from the determined audio stream and according to the start time, the audio stream corresponding to the video stream. In this way, the terminal can obtain and play an audio stream that matches its own capabilities.
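How a terminal might pick one audio stream from the description information can be sketched as below. The text does not define what "capability" means, so the criteria used here (a preferred language and a maximum code rate) are assumptions, as are the names `AudioDescription` and `select_audio`. The fields mirror the kinds of description information the document mentions elsewhere: playing language, stream identifier, and code rate.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AudioDescription:
    stream_id: str       # stream identifier
    language: str        # playing language
    bitrate_kbps: int    # code rate of the audio stream


def select_audio(descriptions: Sequence[AudioDescription],
                 preferred_language: str,
                 max_bitrate_kbps: int) -> Optional[AudioDescription]:
    """Pick the highest-rate audio stream in the preferred language that the
    terminal can handle, or None if nothing matches (assumed policy)."""
    candidates = [d for d in descriptions
                  if d.language == preferred_language
                  and d.bitrate_kbps <= max_bitrate_kbps]
    return max(candidates, key=lambda d: d.bitrate_kbps, default=None)


# Hypothetical description information carried in the video stream.
descriptions = [
    AudioDescription("a1", "zh", 128),
    AudioDescription("a2", "en", 128),
    AudioDescription("a3", "en", 320),
]
choice = select_audio(descriptions, preferred_language="en", max_bitrate_kbps=192)
print(choice.stream_id if choice else None)  # prints a2
```

The terminal would then place the chosen description information in the acquisition request alongside the fragment identifier.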
  • In a possible implementation, before receiving the acquisition request that carries the fragment identifier, the server sends a Real-time Transport Control Protocol (RTCP) packet to the terminal, where the fragment timestamp field of the RTCP packet carries the fragment identifier. The terminal can then send the server an acquisition request carrying that fragment identifier, and the acquisition request asks the server for the audio stream corresponding to the video stream the terminal has received.
  • In a possible implementation, the video stream that the server has sent to the terminal is a first video stream of a first code rate, and the method further includes: when the server stops sending the first video stream to the terminal and starts sending a second video stream of a second code rate, where the first code rate is smaller than the second code rate, the server sends a notification message to the terminal. The notification message carries the sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream.
  • The terminal receives the first data packet of the second video stream after the last data packet of the first video stream, and the sequence numbers of these two packets are not consecutive. To prevent the terminal from wrongly concluding that packets were lost and therefore not playing the second video stream, the server sends the notification message, and the terminal determines from it that the last data packet and the first data packet are consecutive data packets, so that it plays the second video stream normally starting from the first data packet.
  • In a possible implementation, the notification message sent by the server to the terminal is a server terminal notification (SCN) message. The SCN message includes an old sequence number field and a new sequence number field: the old sequence number field carries the sequence number of the last data packet of the first video stream sent by the server to the terminal, and the new sequence number field carries the sequence number of the first data packet of the second video stream sent by the server to the terminal.
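The role of the old and new sequence numbers can be illustrated with a small terminal-side continuity check. This is a sketch under stated assumptions: the class names `SwitchNotification` and `LossDetector` are hypothetical, the notification is assumed to arrive before the first data packet of the second video stream, and sequence numbers are assumed to be 16-bit RTP sequence numbers that wrap at 65536. The actual SCN message layout is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MODULUS = 65536  # RTP sequence numbers are 16 bits wide


@dataclass(frozen=True)
class SwitchNotification:
    old_seq: int  # sequence number of the last packet of the first video stream
    new_seq: int  # sequence number of the first packet of the second video stream


class LossDetector:
    """Treats a sequence jump as loss unless a notification explains it."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None
        self.notification: Optional[SwitchNotification] = None

    def on_notification(self, notification: SwitchNotification) -> None:
        self.notification = notification

    def on_packet(self, seq: int) -> bool:
        """Return True if the packet counts as consecutive, False on a gap."""
        prev, self.last_seq = self.last_seq, seq
        if prev is None or seq == (prev + 1) % SEQ_MODULUS:
            return True
        n = self.notification
        # A jump from old_seq to new_seq is the announced stream switch.
        return n is not None and prev == n.old_seq and seq == n.new_seq


detector = LossDetector()
detector.on_notification(SwitchNotification(old_seq=101, new_seq=5000))
print([detector.on_packet(s) for s in (100, 101, 5000, 5001)])
# prints [True, True, True, True]
```

Without the notification, the same jump from 101 to 5000 would be reported as a gap, which is the erroneous packet-loss determination the notification message is meant to avoid.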
  • A second aspect provides a method for playing a media stream. The method comprises: a terminal sends an acquisition request to a server, where the acquisition request carries a fragment identifier of a video stream fragment in a received video stream, so that the server sends, according to the acquisition request, an audio stream corresponding to the video stream, where the start time of the audio stream is the same as the start time of the video stream; and the terminal receives the audio stream sent by the server and plays the received video stream and audio stream. Because the video stream and the audio stream have the same start time, the terminal keeps audio and video synchronized when playing them together, which solves the problem in the related art that the terminal plays no sound or plays sound inconsistent with the video picture.
  • In a possible implementation, the video stream received by the terminal includes description information of at least one audio stream. The terminal selects, from that description information, the description information of one audio stream that matches its own capability, and the acquisition request sent to the server further includes the description information of that one audio stream, so that the terminal obtains from the server an audio stream matching its own capability for playback.
  • In a possible implementation, before sending the acquisition request that carries the fragment identifier to the server, the terminal receives an RTCP packet sent by the server and obtains, from the fragment timestamp field of the RTCP packet, the fragment identifier of the video stream fragment in the video stream it has received. The terminal adds the fragment identifier to the acquisition request, and after receiving the acquisition request, the server sends the terminal, according to the fragment identifier, the audio stream corresponding to the video stream.
  • In a possible implementation, the video stream received by the terminal is a first video stream of a first code rate. The terminal further receives a second video stream of a second code rate sent by the server, where the first code rate is smaller than the second code rate. The terminal also receives a notification message sent by the server, which carries two sequence numbers: the sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream. The terminal plays the received second video stream according to the notification message.
  • The terminal receives the first data packet of the second video stream after the last data packet of the first video stream, and the sequence numbers of these two packets are not consecutive. To prevent the terminal from wrongly concluding that packets were lost, the server sends the terminal a notification message that includes the sequence number of the last data packet and the sequence number of the first data packet. The terminal determines from the notification message that the two are consecutive data packets, so that it plays the second video stream normally starting from the first data packet.
  • In a possible implementation, the notification message that the terminal receives from the server is an SCN message. The SCN message includes an old sequence number field and a new sequence number field: the old sequence number field carries the sequence number of the last data packet of the first video stream received by the terminal from the server, and the new sequence number field carries the sequence number of the first data packet of the second video stream received by the terminal from the server.
  • A third aspect provides an apparatus for playing a media stream. The apparatus comprises at least one unit configured to implement the method for playing a media stream provided by the first aspect or any possible implementation of the first aspect.
  • A fourth aspect provides an apparatus for playing a media stream. The apparatus comprises at least one unit configured to implement the method for playing a media stream provided by the second aspect or any possible implementation of the second aspect.
  • A fifth aspect provides a server comprising a processor and a network port, where the processor is configured to implement, by executing instructions, the method for playing a media stream provided by the first aspect or any possible implementation of the first aspect.
  • A sixth aspect provides a terminal comprising a processor and a network port, where the processor is configured to implement, by executing instructions, the method for playing a media stream provided by the second aspect or any possible implementation of the second aspect.
  • A seventh aspect provides a system for playing a media stream, comprising the server provided by the third aspect or the fifth aspect and the terminal provided by the fourth aspect or the sixth aspect.
  • An eighth aspect provides a computer-readable medium that stores instructions for implementing the method for playing a media stream provided by the first aspect or any possible implementation of the first aspect, or instructions for implementing the method for playing a media stream provided by the second aspect or any possible implementation of the second aspect.
  • FIG. 1 is a schematic structural diagram of an IPTV system according to an exemplary embodiment of the present invention.
  • FIG. 2 is a schematic diagram of video stream segmentation and audio stream segmentation according to an exemplary embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a server according to an exemplary embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a terminal according to an exemplary embodiment of the present invention.
  • FIG. 5 is a flowchart of a method for playing a media stream according to an exemplary embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an RTCP packet according to an exemplary embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an SCN message according to an exemplary embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a server according to another exemplary embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a terminal according to another exemplary embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of an IPTV system according to an exemplary embodiment of the present invention.
  • the IPTV system includes: a source server 110, an encoding server 120, a multicast server 130, an FCC server 140, and a terminal 150.
  • the source server 110 can be a server, or a server cluster consisting of several servers, or a cloud computing service center.
  • the source server 110 includes a media stream of each of at least one channel, the media stream includes a second video stream and an audio stream corresponding to the second video stream, and the like, and may further include a subtitle stream or the like.
  • the source server 110 is connected to the encoding server 120 via a wired network or a wireless network, and the media stream of each channel can be transmitted to the encoding server 120.
  • the channel may include a television channel, a live channel, a carousel channel, and the like.
  • The media stream in the source server 110 is created in advance by a technician and uploaded to the source server 110. When creating the media stream, the technician may create the second video stream and at least one audio stream corresponding to the second video stream and set description information for each audio stream; the created second video stream includes the description information of each audio stream that has been set.
  • The description information of an audio stream may include information such as the playing language, the stream identifier, and the code rate of that audio stream.
  • For example, the technician creates two audio streams for the second video stream A, one in which the playing language is Chinese and one in which the playing language is English, and the created second video stream A includes the description information of both audio streams.
  • The second video stream produced by a technician usually has a higher code rate, and therefore a higher resolution. The code rate of the second video stream is referred to as the second code rate.
  • the encoding server 120 receives the media stream of each channel transmitted by the source server 110.
  • For the media stream of each channel, the encoding server 120 uses an Adaptive Bit Rate (ABR) encoding technique to down-convert the second video stream included in the media stream and generate at least one first video stream. The first code rates of the first video streams differ from each other, and each is smaller than the second code rate of the second video stream. Each channel therefore corresponds to one second video stream, at least one first video stream, and at least one audio stream. Because each first video stream is obtained by down-converting the second video stream, the resolution of the first video stream is lower than that of the second video stream.
  • The encoding server 120 stores at least one preset first code rate and can down-convert the second video stream according to each stored first code rate to obtain at least one first video stream.
  • The encoding server 120 is connected to the multicast server 130 through a wired or wireless network and transmits the media stream of each channel to the multicast server 130. In addition to the content received from the source server 110, the media stream of each channel transmitted to the multicast server 130 includes the at least one first video stream obtained by down-converting the second video stream.
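The constraints on the preset first code rates (all different from each other, all below the second code rate) can be expressed as a small validation step. This is only a sketch of the constraint check, not of ABR encoding itself; the function name `build_ladder` and the use of kbit/s are assumptions introduced for illustration.

```python
from typing import List


def build_ladder(second_rate_kbps: int,
                 preset_first_rates_kbps: List[int]) -> List[int]:
    """Return the usable first code rates in ascending order: duplicates are
    dropped and only rates below the second code rate are kept."""
    unique = sorted(set(preset_first_rates_kbps))
    ladder = [r for r in unique if 0 < r < second_rate_kbps]
    if not ladder:
        raise ValueError("need at least one first code rate below the second code rate")
    return ladder


# Hypothetical presets: 9000 is rejected because it is not below 8000.
print(build_ladder(8000, [2000, 500, 2000, 9000]))  # prints [500, 2000]
```

Each rate in the resulting list would correspond to one first video stream produced from the second video stream.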
  • the multicast server 130 can be a server, or a server cluster consisting of several servers, or a cloud computing service center.
  • the multicast server 130 is connected to the FCC server 140 and the terminal 150 via a wired network or a wireless network, respectively.
  • the multicast server 130 receives the media stream of each channel sent by the encoding server 120, and forwards the received media stream of each channel to the FCC server 140 in real time.
  • the multicast server 130 also maintains a multicast group for each channel, and transmits the media stream of the channel to the terminal 150 located in the multicast group.
  • the FCC server 140 can be a server, or a server cluster consisting of several servers, or a cloud computing service center.
  • the FCC server 140 is connected to the terminal 150 through a wired network or a wireless network.
  • The terminal 150 may be a smart TV, a set-top box, a smartphone, a tablet, a laptop computer, a desktop computer, or the like. At startup or when switching channels, it sends the FCC server 140 a play request carrying the channel identifier of the target channel, requesting the FCC server 140 to send the media stream of the target channel.
  • The FCC server 140 is configured to: after receiving the play request from the terminal 150, send the terminal 150, according to the channel identifier of the target channel carried in the play request, the first video stream corresponding to the target channel and the fragment identifier of a video stream fragment included in the first video stream; receive the acquisition request carrying the fragment identifier sent by the terminal 150; obtain the audio stream corresponding to the first video stream according to the fragment identifier; and send the audio stream to the terminal 150.
  • The terminal 150 is configured to receive the first video stream and the fragment identifier sent by the FCC server 140, send an acquisition request carrying the fragment identifier to the FCC server 140, receive the audio stream corresponding to the first video stream sent by the FCC server 140, and play the first video stream and the audio stream. Because the first code rate of the first video stream is lower than the second code rate, the terminal 150 can start playing the first video stream quickly.
  • the FCC server 140 transmits a second video stream corresponding to the target channel to the terminal 150. Since the second code rate of the second video stream is greater than the first code rate of the first video stream, the quality of the video picture played by the terminal 150 can be improved.
  • The terminal 150 is further configured to receive and play the second video stream of the target channel sent by the FCC server 140, apply to join the multicast group corresponding to the target channel in the multicast server 130, and receive the second video stream of the target channel sent by the multicast server 130. When the second video stream sent by the FCC server 140 is the same as the second video stream sent by the multicast server 130, the terminal 150 plays the second video stream sent by the multicast server 130.
  • The encoding server 120 may further fragment the second video stream included in the media stream, so that the second video stream consists of multiple video stream fragments, and generate a fragment identifier for each video stream fragment.
  • Each first video stream that the encoding server 120 obtains by down-converting the second video stream is likewise composed of video stream fragments.
  • Each video stream fragment in the second video stream corresponds to one video stream fragment in the first video stream. The two corresponding fragments have the same fragment identifier, but they differ in resolution and in the number of data packets they contain: a video stream fragment in the second video stream contains more data packets than its counterpart in the first video stream.
  • The first frame of each video stream fragment in the second video stream is a key frame used for decoding the video stream, and the first frame of each video stream fragment in the first video stream is likewise a key frame used for decoding.
  • The data packets may be RTP data packets.
  • For example, the video stream fragment S11 in the first video stream corresponds to the video stream fragment S21 in the second video stream, and the fragment identifiers of S11 and S21 are both "1"; S12 corresponds to S22, and the fragment identifiers of both are "2"; S13 corresponds to S23, and the fragment identifiers of both are "3"; S14 corresponds to S24, and the fragment identifiers of both are "4".
  • Each video stream fragment in the second video stream contains more data packets than its corresponding video stream fragment in the first video stream. For example, the video stream fragment S11 in the first video stream contains 2 data packets, fewer than the 4 data packets contained in the video stream fragment S21.
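The correspondence between fragments of the two video streams can be modelled as a lookup by fragment identifier. In this sketch the `VideoFragment` class and `pair_fragments` function are hypothetical; the packet counts for S11 and S21 (2 and 4) come from the text, while the counts for the other fragments are assumed to follow the same pattern.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VideoFragment:
    name: str          # label such as "S11"
    fragment_id: int   # shared by corresponding fragments in both streams
    packet_count: int  # number of data packets in the fragment


def pair_fragments(first: List[VideoFragment],
                   second: List[VideoFragment]) -> Dict[int, Tuple[str, str]]:
    """Map each fragment identifier to (first-stream name, second-stream name)
    and check that the second-stream fragment holds more data packets."""
    second_by_id = {f.fragment_id: f for f in second}
    pairs: Dict[int, Tuple[str, str]] = {}
    for low in first:
        high = second_by_id[low.fragment_id]
        if high.packet_count <= low.packet_count:
            raise ValueError(f"{high.name} should hold more packets than {low.name}")
        pairs[low.fragment_id] = (low.name, high.name)
    return pairs


# Packet counts other than those of S11 and S21 are illustrative assumptions.
first_stream = [VideoFragment(f"S1{i}", i, 2) for i in range(1, 5)]
second_stream = [VideoFragment(f"S2{i}", i, 4) for i in range(1, 5)]
pairs = pair_fragments(first_stream, second_stream)
print(pairs[1], pairs[4])  # prints ('S11', 'S21') ('S14', 'S24')
```

Keying both streams by the same identifier is what lets a server or terminal move between code rates at a fragment boundary without losing its place in the content.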
  • The encoding server 120 may further fragment the audio stream included in the media stream, so that the audio stream consists of multiple audio stream fragments, and generate a fragment identifier for each audio stream fragment.
  • Each video stream fragment in the first video stream and the second video stream corresponds to one audio stream fragment in the audio stream, and a video stream fragment has the same start time as its corresponding audio stream fragment.
  • After fragmenting the first video stream, the second video stream, and the audio stream included in the media stream, the encoding server 120 sends the media stream to the multicast server 130.
  • The multicast server 130 may add a timestamp to each video stream fragment in the first video stream, to each video stream fragment in the second video stream, and to each audio stream fragment in the audio stream.
  • A video stream fragment in the first video stream and its corresponding video stream fragment in the second video stream have the same timestamp, and the audio stream fragment corresponding to them has that timestamp as well.
  • For example, the audio stream fragment corresponding to the video stream fragments S11 and S21 is S31, and S11, S21, and S31 have the same timestamp; S11 and S21 therefore carry the same video content, and the audio content in S31 corresponds to that video content.
  • After adding timestamps to each video stream fragment in the first video stream, each video stream fragment in the second video stream, and each audio stream fragment in the audio stream, the multicast server 130 sends the media stream to the FCC server 140 and the terminal 150.
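One way to guarantee that corresponding fragments receive the same timestamp is to derive the timestamp from the fragment identifier. The text does not say how the multicast server computes timestamps, so the rule below (a base timestamp plus a fixed fragment duration) is purely an assumption, as are the function name `stamp_fragments` and the tick values.

```python
from typing import Dict, List, Tuple


def stamp_fragments(streams: Dict[str, List[int]],
                    base_timestamp: int,
                    fragment_duration: int) -> Dict[Tuple[str, int], int]:
    """Assign a timestamp to every (stream name, fragment id) pair so that
    fragments sharing an identifier share a timestamp (assumed rule:
    base + (id - 1) * duration)."""
    return {
        (name, fragment_id): base_timestamp + (fragment_id - 1) * fragment_duration
        for name, fragment_ids in streams.items()
        for fragment_id in fragment_ids
    }


# Hypothetical streams with fragment identifiers 1..4 and 90 kHz clock ticks.
streams = {"first": [1, 2, 3, 4], "second": [1, 2, 3, 4], "audio": [1, 2, 3, 4]}
stamps = stamp_fragments(streams, base_timestamp=90000, fragment_duration=180000)
print(stamps[("first", 1)], stamps[("second", 1)], stamps[("audio", 1)])
# prints 90000 90000 90000
```

With timestamps aligned this way, the fragments playing the roles of S11, S21, and S31 in the example above would all carry the same value.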
  • FIG. 3 is a schematic structural diagram of a server provided by an exemplary embodiment of the present invention.
  • the server may be the FCC server 140 in the embodiment shown in FIG. 1 , where the server includes: a processor 31 and a network interface 32.
  • the processor 31 includes one or more processing cores, and the processor 31 executes various functional applications and data processing by running software programs and modules.
  • the cache 33 is coupled to the processor 31 for buffering media streams of at least one channel received from the multicast server 130 and may be coupled to the network interface 32 via a bus.
  • the memory 34 is coupled to the processor 31.
  • the memory 34 can be coupled to the processor 31 via a bus; the memory 23 can be used to store software programs and modules.
  • the memory 34 can store an application module 35 required for at least one function, and the application module 35 can include a transmitting module 351, a processing module 352, a receiving module 353, and the like.
  • the processor 31 executes the corresponding steps performed by the FCC server in FIG. 5 by running the transmitting module 351, the processing module 352, and the receiving module 353, as specifically described with respect to FIG.
  • The memory 34 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • The structure of the server shown in FIG. 3 does not constitute a limitation on the server, which may include more or fewer components than those illustrated, combine some components, or use a different component arrangement.
  • FIG. 4 is a schematic structural diagram of a terminal provided by an exemplary embodiment of the present invention.
  • the terminal can be the terminal 150 in the embodiment shown in FIG.
  • the terminal includes a processor 41, a network interface 42, and a memory 43.
  • the processor 41 includes one or more processing cores, and the processor 41 executes various functional applications and data processing by running software programs and modules.
  • the memory 43 is coupled to the processor 41.
  • the memory 43 can be coupled to the processor 41 via a bus; the memory 43 can be used to store software programs and modules.
  • the memory 43 can store an application module 44 required for at least one function, and the application module 44 can include a transmission module 441, a processing module 442, a receiving module 443, and the like.
  • the processor 41 executes the corresponding steps performed by the terminal in FIG. 5 by running the transmitting module 441, the processing module 442, and the receiving module 443, with specific reference to the description of FIG.
  • The memory 43 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as SRAM, EEPROM, EPROM, PROM, ROM, magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • The structure of the terminal shown in FIG. 4 does not constitute a limitation on the terminal, which may include more or fewer components than those illustrated, combine some components, or use a different component arrangement.
  • FIG. 5 illustrates a method for playing a media stream according to an exemplary embodiment of the present invention.
  • the method for playing a media stream is applied to an IPTV system as shown in FIG. 1.
  • The method for playing a media stream may include the following steps:
  • Step 501 The terminal sends a play request carrying the channel identifier of the target channel to the FCC server.
  • the terminal can send a play request to the FCC server in the following situations, including:
  • the terminal needs to send a play request to the FCC server when switching from the currently playing original channel to the target channel.
  • the user can select the target channel that needs to be switched.
  • When the terminal detects the selected target channel, it acquires the channel identifier of the target channel and sends a play request carrying that channel identifier to the FCC server.
  • When the terminal detects the selected target channel, it also disconnects from the multicast server so that it stops receiving the media stream of the original channel sent by the multicast server.
  • the terminal uses the preset channel as the target channel when starting, and sends a play request to the FCC server to request to join the target channel.
  • The preset channel can be the default channel set for the terminal at the factory, or the channel the terminal was playing when it was last turned off. At startup, the terminal can therefore obtain the channel identifier of the default channel or of the channel played last time, use the obtained channel identifier as the channel identifier of the target channel, and send a play request carrying the channel identifier of the target channel to the FCC server.
  • Step 502 The FCC server receives the play request sent by the terminal and, according to the channel identifier of the target channel carried in the play request, sends to the terminal the first video stream of the target channel and the fragment identifier of a video stream fragment included in the first video stream.
  • the FCC server receives and caches the media stream of each channel sent by the multicast server in real time, and the media stream of each channel includes a first video stream, a second video stream, and at least one audio stream.
  • This step may be implemented as follows: the FCC server receives the play request sent by the terminal and, according to the channel identifier of the target channel carried in the play request, determines the first video stream of the target channel that it has cached. From that cached first video stream, it obtains a key frame located before the current time, and then obtains the portion of the first video stream from the time corresponding to the key frame up to the current time. It sends the acquired first video stream and a playback response to the terminal; the playback response carries the fragment identifier of a video stream fragment included in the acquired first video stream.
  • The FCC server can obtain the key frame in several ways. For example, it can obtain, from the cached first video stream of the target channel, the key frame that precedes the current time and is closest to it, or it can randomly select a key frame that precedes the current time.
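  • The "closest key frame before the current time" option can be sketched as a search over sorted key-frame times. This is only an illustration: the function names, the use of plain numeric times, and the assumption that key-frame times are kept in a sorted list are not taken from the text.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def nearest_key_frame_before(key_frame_times: List[float], now: float) -> Optional[float]:
    """Return the latest key-frame time strictly before `now`, or None.

    `key_frame_times` must be sorted in ascending order (an assumption).
    """
    i = bisect_left(key_frame_times, now)  # number of key frames with time < now
    return key_frame_times[i - 1] if i > 0 else None


def burst_window(key_frame_times: List[float], now: float) -> Optional[Tuple[float, float]]:
    """Return the (start, end) span of cached video to send, or None if no key frame exists."""
    start = nearest_key_frame_before(key_frame_times, now)
    return None if start is None else (start, now)
```

  • Starting at the closest key frame keeps the burst short while still giving the decoder a frame it can decode independently.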
  • The FCC server may send the acquired first video stream at a transmission rate greater than the rate at which it receives the first video stream of the target channel. For example, in implementation, the FCC server determines the rate at which it is currently receiving the first video stream of the target channel from the multicast server, calculates the transmission rate from a preset first multiple and the reception rate, and sends the acquired first video stream to the terminal at that transmission rate.
  • the first multiple is greater than 1, for example, the first multiple may be 1.4, 1.5, 1.6, etc., and the reception rate is multiplied by the first multiple to obtain the size of the transmission rate.
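  • The rate calculation above is a single multiplication. The sketch below assumes rates expressed in bits per second and uses a hypothetical function name; only the rule "transmission rate = reception rate × multiple" comes from the text.

```python
def transmission_rate(reception_rate_bps: float, multiple: float) -> float:
    """Return the sending rate obtained by scaling the reception rate by a preset multiple."""
    if reception_rate_bps <= 0:
        raise ValueError("reception rate must be positive")
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return reception_rate_bps * multiple
```

  • With a reception rate of 4 Mbit/s and a first multiple of 1.5, the FCC server would send the cached first video stream at 6 Mbit/s.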
  • The FCC server may send the acquired first video stream and the playback response in either of two orders: it may send the playback response first and then the acquired first video stream, or it may send the acquired first video stream and the playback response together.
  • the FCC server may transmit the playback response to the terminal while starting to transmit the acquired first video stream, or send the playback response to the terminal at some time after starting to send the first video stream.
  • the fragment identifier carried in the playback response is a fragment identifier of the nth video stream segment included in the acquired first video stream, where n is a positive integer greater than or equal to 1. Usually, n is 1 or 2 or 3.
  • the playback response may be an RTCP packet.
  • The RTCP packet includes a Default Bitrate/Adaptive Bitrate field, a synchronization source identifier of the current unicast burst (SSRC of Current Unicast Burst) field, a User Internet Protocol (User IP) field, a User RTCP Port field, a Reserved field, a Representation ID field, a Bandwidth field, and an extended Fragment Time Stamp field. The Fragment Time Stamp field is used to carry the fragment identifier of a video stream fragment included in the acquired first video stream.
  • Step 503 The terminal receives the first video stream and the fragment identifier of the video stream fragment sent by the FCC server, and sends an acquisition request carrying the fragment identifier to the FCC server.
  • This step may be implemented as follows: the terminal receives the first video stream and the playback response sent by the FCC server, extracts from the playback response the fragment identifier of the video stream fragment included in the first video stream, and sends an acquisition request carrying the fragment identifier to the FCC server.
  • the first video stream includes description information of at least one audio stream.
  • the terminal may select description information of an audio stream from the description information of the at least one audio stream according to the capability information of the terminal, and the acquiring request may further carry the selected description information of the audio stream.
  • the capability information of the terminal may include a bandwidth size of the terminal, a terminal decoding capability, and/or a hardware resource of the terminal.
  • If the first video stream includes description information of only one audio stream, the acquisition request may or may not carry the description information of that audio stream. Likewise, if the terminal selects the description information of the default audio stream, the acquisition request may or may not carry it.
  • Step 504 The FCC server receives the acquisition request sent by the terminal, and determines, according to the fragment identifier in the acquisition request, a start time of the first video stream that has been sent to the terminal.
  • The FCC server receives the acquisition request sent by the terminal and extracts the fragment identifier carried in it. The extracted fragment identifier is that of the nth video stream fragment included in the first video stream that has been sent to the terminal. From the extracted fragment identifier, the FCC server determines the fragment identifier of the first video stream fragment included in the first video stream, obtains the timestamp of that first video stream fragment according to its fragment identifier, and takes that timestamp as the start time of the first video stream that has been sent to the terminal.
  • If n is 1, the extracted fragment identifier can be used directly as the fragment identifier of the first video stream fragment. If n is greater than 1, the fragment identifier of the first video stream fragment is calculated from the extracted fragment identifier and the value n.
  • The fragment identifiers of the video stream fragments in the first video stream change consecutively, and the fragment identifiers of two adjacent video stream fragments differ by 1. The calculation may therefore be as follows: if the fragment identifier extracted from the acquisition request is m, the fragment identifier of the first video stream fragment in the first video stream is m - (n - 1).
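  • The start-time determination of Step 504 can be sketched as follows, under the stated property that adjacent fragment identifiers differ by 1, so that the nth fragment having identifier m implies the first fragment has identifier m - (n - 1). The function names and the dictionary mapping fragment identifiers to timestamps are hypothetical.

```python
from typing import Dict


def first_fragment_id(m: int, n: int) -> int:
    """Return the identifier of the first fragment, given that the nth fragment has identifier m."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return m - (n - 1)


def burst_start_time(m: int, n: int, fragment_timestamps: Dict[int, int]) -> int:
    """Return the timestamp of the first fragment, used as the start time of the sent video."""
    return fragment_timestamps[first_fragment_id(m, n)]
```

  • For n = 1 the extracted identifier is returned unchanged, which matches the special case described above.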
  • Step 505 The FCC server acquires an audio stream corresponding to the first video stream according to the start time, and sends the audio stream to the terminal.
  • Two cases are possible. In the first case, the first video stream includes description information of only one audio stream, that is, the target channel corresponds to only one audio stream. In the second case, the terminal selects the description information of the default audio stream agreed with the FCC server in advance.
  • In the first case, the FCC server determines the one audio stream corresponding to the cached target channel, obtains from it the audio stream from the start time to the current time, and sends the acquired audio stream to the terminal.
  • In the second case, the FCC server determines, from the at least one audio stream corresponding to the cached target channel, the default audio stream corresponding to the description information of the default audio stream, obtains from the default audio stream the audio stream from the start time to the current time, and sends the acquired audio stream to the terminal.
  • If the acquisition request carries description information of an audio stream, this step may be: the FCC server determines, from the at least one audio stream corresponding to the cached target channel, the audio stream corresponding to that description information, obtains from the determined audio stream the audio stream from the start time to the current time, and sends the acquired audio stream to the terminal.
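  • In every case the FCC server ends up taking, from one cached audio stream, the part between the start time and the current time. A minimal sketch of that selection is shown below; the `AudioFragment` type and the representation of the cache as a list of timestamped fragments are assumptions made for illustration.

```python
from typing import List, NamedTuple


class AudioFragment(NamedTuple):
    timestamp: int  # same time base as the video fragment timestamps (assumed)
    payload: bytes


def audio_from_start(cache: List[AudioFragment], start_time: int, now: int) -> List[AudioFragment]:
    """Return the cached audio fragments whose timestamps lie in [start_time, now]."""
    return [fragment for fragment in cache if start_time <= fragment.timestamp <= now]
```

  • Because the first returned fragment carries the same timestamp as the first video stream fragment, the audio and video received by the terminal begin at the same point.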
  • the FCC server when transmitting the acquired audio stream, may transmit the audio stream with a reception rate greater than the frequency at which it received the audio stream. For example, in implementation, the FCC server determines the receiving rate of the audio stream that is currently received by the multicast server, calculates a sending rate according to the preset second multiple and the receiving rate, and sends the audio stream to the terminal according to the sending rate. .
  • the second multiple is greater than 1, for example, the first multiple may be 1.4, 1.5, 1.6, etc., and the reception rate is multiplied by the second multiple to obtain the size of the transmission rate.
  • the second multiple may be equal to or not equal to the first multiple.
  • Step 506 The terminal receives the audio stream sent by the FCC server, and provides the received video stream and the audio stream to the player on the terminal for playing.
  • Since the audio stream received by the terminal has the same start time as the video stream, the terminal starts playing the audio stream and the video stream simultaneously, which ensures that the sound of the target channel is synchronized with the video picture.
  • Step 507 When the FCC server stops sending the first video stream to the terminal and starts sending the second video stream of the target channel to the terminal, it sends a notification message to the terminal. The notification message carries the sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream.
  • After starting to send the first video stream of the target channel to the terminal, the FCC server keeps sending the first video stream to the terminal while also continuing to receive the first video stream sent by the multicast server.
  • The FCC server determines the video stream fragment to which the data packet currently sent to the terminal belongs, and stops sending the first video stream to the terminal once the determined video stream fragment has been sent.
  • The FCC server then sends the second video stream of the target channel to the terminal. The first video stream fragment included in the second video stream is the video stream fragment that follows the determined video stream fragment.
  • For example, suppose the data packet of the first video stream that the FCC server is currently sending to the terminal has sequence number 4, and the data packet of the first video stream that the FCC server is currently receiving also has sequence number 4.
  • The FCC server determines that the data packet with sequence number 4 belongs to the video stream fragment S12, whose fragment identifier is 2. After the FCC server finishes sending the video stream fragment S12, it sends the second video stream to the terminal.
  • The first video stream fragment of the second video stream is S23, whose fragment identifier is 3; S23 is the video stream fragment that follows S12 (fragment identifier 2).
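  • The hand-over point between the two video streams can be sketched as a lookup from packet sequence number to fragment identifier. The mapping from fragment identifiers to sequence-number ranges below is hypothetical (the text only states that packet 4 belongs to the fragment with identifier 2), as are the function names.

```python
from typing import Dict, Tuple


def fragment_containing(seq: int, fragments: Dict[int, range]) -> int:
    """Return the identifier of the fragment whose packet range contains `seq`."""
    for fragment_id, seqs in fragments.items():
        if seq in seqs:
            return fragment_id
    raise KeyError(f"no fragment contains sequence number {seq}")


def plan_handover(current_seq: int, first_stream_fragments: Dict[int, range]) -> Tuple[int, int]:
    """Return (last fragment id sent from the first stream, first fragment id of the second stream)."""
    last_id = fragment_containing(current_seq, first_stream_fragments)
    return last_id, last_id + 1  # adjacent fragment identifiers differ by 1
```

  • With the assumed layout `{1: range(1, 3), 2: range(3, 5)}`, `plan_handover(4, ...)` yields `(2, 3)`: the first video stream ends after fragment 2 and the second video stream begins at fragment 3, as in the S12/S23 example.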
  • The sequence number of the last data packet included in the first video stream and the sequence number of the first data packet included in the second video stream are not consecutive, so the terminal could mistakenly conclude that packet loss has occurred when it plays the second video stream.
  • Therefore, when the FCC server starts to send the second video stream of the target channel to the terminal, it sends the notification message to the terminal. The sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream carried in the notification message inform the terminal that these two data packets are consecutive.
  • For example, the last data packet of the video stream fragment S12 has sequence number 4. After sending that data packet, the FCC server sends the second video stream to the terminal. The first video stream fragment of the second video stream is S23, and the first data packet in S23 has sequence number 9. The terminal therefore receives the data packet with sequence number 9 immediately after the data packet with sequence number 4. Because the terminal also receives the notification message carrying sequence numbers 4 and 9, it can determine from the message that the two data packets are consecutive and that no packet loss has occurred.
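  • The terminal-side check can be sketched as below. The `SwitchNotice` type and function name are hypothetical, and the 16-bit wrap-around of sequence numbers is an assumption borrowed from RTP rather than something stated in the text.

```python
from typing import NamedTuple, Optional


class SwitchNotice(NamedTuple):
    last_seq: int   # sequence number of the last packet of the first video stream
    first_seq: int  # sequence number of the first packet of the second video stream


def is_packet_loss(prev_seq: int, next_seq: int,
                   notice: Optional[SwitchNotice] = None,
                   seq_modulus: int = 1 << 16) -> bool:
    """Return True if the gap between two received packets indicates packet loss."""
    if next_seq == (prev_seq + 1) % seq_modulus:
        return False  # ordinary consecutive packets
    if notice is not None and (prev_seq, next_seq) == (notice.last_seq, notice.first_seq):
        return False  # announced jump at the stream switch
    return True
```

  • Without the notification, the jump from 4 to 9 is reported as loss; with `SwitchNotice(4, 9)` it is treated as consecutive.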
  • When sending the second video stream, the FCC server may use a transmission rate lower than the rate at which it receives the second video stream. For example, in implementation, the FCC server determines the rate at which it is currently receiving the second video stream of the target channel from the multicast server, calculates the transmission rate from a preset third multiple and the reception rate, and sends the acquired second video stream to the terminal at that transmission rate.
  • the third multiple is less than 1, for example, the third multiple may be 0.6, 0.7, 0.8, etc., and the reception rate is multiplied by the third multiple to obtain the size of the transmission rate.
  • the notification message may be an SCN message.
  • The SCN message includes a Final Optimal Adaptive Bitrate field, an extended old sequence number (Last Sequence Number) field, and an extended new sequence number (First Sequence Number) field. The old sequence number field carries the sequence number of the last data packet, and the new sequence number field carries the sequence number of the first data packet.
  • Step 508 The terminal receives the notification message sent by the FCC server, plays the second video stream sent by the FCC server according to the notification message, joins the multicast group of the target channel, and receives the second video stream and the audio stream sent by the multicast server.
  • the terminal receives the notification message sent by the FCC server, and also receives the second video stream sent by the FCC server, and obtains the sequence number of the last data packet of the first video stream carried in the notification message and the first data packet of the second video stream. Serial number.
  • According to the sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream, the terminal determines that these two data packets are consecutive, and then plays the second video stream starting from its first data packet.
  • When the terminal plays the second video stream, it also sends a join request carrying the channel identifier of the target channel to the multicast server.
  • the multicast server receives the join request, adds the terminal to the multicast group of the target channel according to the channel identifier of the target channel, and then sends the second video stream and the audio stream of the target channel to the terminal.
  • After starting to play the second video stream sent by the FCC server, the terminal receives and caches the second video stream and the audio stream sent by the multicast server.
  • Step 509 When the data packet of the second video stream currently played by the terminal is the same as the first data packet of the second video stream sent by the multicast server, the terminal plays the second video stream and the audio stream sent by the multicast server.
  • When the terminal plays the second video stream and the audio stream sent by the multicast server, it has completed the switch from the original channel to the target channel, or has joined the target channel after startup. At the same time, the terminal can disconnect from the FCC server to stop receiving the second video stream and the audio stream of the target channel sent by the FCC server.
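  • The switch in Step 509 can be sketched as splicing two packet sequences at the first packet they share. Identifying "the same data packet" by its sequence number is an assumption, and the function name is hypothetical.

```python
from typing import List


def playback_order(unicast_seqs: List[int], multicast_seqs: List[int]) -> List[int]:
    """Return the packet order played by the terminal across the unicast-to-multicast switch.

    Playback follows the FCC server's unicast packets until it reaches the packet that
    is also the first cached multicast packet, then continues from the multicast cache.
    """
    if not multicast_seqs or multicast_seqs[0] not in unicast_seqs:
        return list(unicast_seqs)  # no overlap yet: keep playing the unicast stream
    switch_at = unicast_seqs.index(multicast_seqs[0])
    return unicast_seqs[:switch_at] + list(multicast_seqs)
```

  • For unicast packets 9 to 12 and multicast packets 11 to 14, the terminal plays 9 and 10 from the FCC server and 11 onward from the multicast server, so no packet is played twice or skipped.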
  • In the method for playing a media stream provided by this embodiment, the terminal sends an acquisition request to the FCC server, and the acquisition request carries the fragment identifier of a video stream fragment in the video stream that the terminal has received from the FCC server. According to the fragment identifier, the FCC server determines the start time of the video stream that has been sent to the terminal, obtains according to the start time an audio stream corresponding to the video stream, whose start time is the same as that of the video stream, and sends the audio stream to the terminal. Because the terminal receives from the FCC server a video stream and an audio stream with the same start time, audio and video are synchronized when the terminal plays them together. This solves the problem that the terminal plays no sound or plays sound that is inconsistent with the video picture, and ensures that the sound played by the terminal is consistent with the video picture.
  • the video stream sent by the server to the terminal includes description information of at least one audio stream, so that the terminal selects description information of an audio stream that matches its own capability, and the acquisition request sent to the server includes description information of the one audio stream. In this way, the terminal can acquire an audio stream that matches its own capabilities for playback according to its own capabilities.
  • The server sends the RTCP packet to the terminal, and the Fragment Time Stamp field of the RTCP packet carries the fragment identifier, so that the terminal can send an acquisition request carrying the fragment identifier to the server and thereby obtain from the server the audio stream corresponding to the video stream it has received.
  • When the server stops sending the first video stream to the terminal and starts sending the second video stream to the terminal, it sends a notification message to the terminal. The notification message carries the sequence number of the last data packet of the first video stream and the sequence number of the first data packet of the second video stream, so that the terminal determines from the notification message that the last data packet and the first data packet are consecutive. This prevents the terminal from wrongly detecting packet loss, so that it plays the second video stream normally, starting from the first data packet.
  • FIG. 8 illustrates a server 800 provided by another exemplary embodiment of the present invention.
  • The server 800 may be the FCC server in the embodiment shown in FIG. 1, FIG. 3, and/or FIG. 5, and includes a sending unit 810, a processing unit 820, and a receiving unit 830.
  • the sending unit 810 is configured to perform the functions of at least one of the foregoing steps 502, 505, and 507.
  • the processing unit 820 is configured to perform the functions of at least one of the foregoing steps 504 and 505.
  • The receiving unit 830 is configured to perform the functions of at least one of the foregoing steps 502 and 504.
  • When the server provided by the foregoing embodiment provides the terminal with the media stream of the target channel that the terminal requests to play, the division into the foregoing functional modules is used only as an example. In practice, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the server may be divided into different functional modules to complete all or part of the functions described above.
  • The server provided in the above embodiment is based on the same concept as the method embodiment of the method for playing a media stream. For the specific implementation process, see the method embodiment; details are not described here again.
  • FIG. 9 illustrates a terminal 900 provided by another exemplary embodiment of the present invention.
  • The terminal 900 may be the terminal in the embodiment shown in FIG. 1, FIG. 4, and/or FIG. 5, and includes a sending unit 910, a processing unit 920, and a receiving unit 930.
  • the sending unit 910 is configured to perform the functions of at least one of the foregoing steps 501, 503, and 508.
  • the processing unit 920 is configured to perform the functions of at least one of the foregoing steps 506, 508, and 509.
  • the receiving unit 930 is configured to perform the functions of at least one of the foregoing steps 503, 506, and 508.
  • When the terminal provided by the foregoing embodiment plays the media stream, the division into the foregoing functional modules is used only as an example. In practice, the foregoing functions may be allocated to different functional modules as required; that is, the internal structure of the terminal may be divided into different functional modules to complete all or part of the functions described above.
  • The terminal provided in the foregoing embodiment is based on the same concept as the method embodiment of the method for playing a media stream. For the specific implementation process, see the method embodiment; details are not described here again.
  • A person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware, and the program may be stored in a computer-readable storage medium.
  • The storage medium mentioned may be a read-only memory, a magnetic disk, an optical disk, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to the field of communications and provides a method, a server, and a terminal for playing a media stream. The method comprises the following steps: a terminal sends an acquisition request carrying a fragment identifier of a video stream fragment, the video stream fragment being a video stream fragment included in a video stream that the terminal has received from the server; the server obtains the start time of the video stream based on the fragment identifier of the video stream fragment; the server obtains an audio stream corresponding to the video stream based on the start time; the server sends the audio stream to the terminal; and the terminal receives the audio stream and plays the video stream and the audio stream. The present invention solves the prior-art problem that a terminal either cannot play audio content or plays audio content that is inconsistent with the video picture, and ensures consistency between the audio content and the video picture played by the terminal.
PCT/CN2018/079520 2017-03-21 2018-03-20 Method, server and terminal for playing a media stream WO2018171567A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710172615.6A CN108632681B (zh) 2017-03-21 2017-03-21 Method, server and terminal for playing a media stream
CN201710172615.6 2017-03-21

Publications (1)

Publication Number Publication Date
WO2018171567A1 true WO2018171567A1 (fr) 2018-09-27

Family

ID=63586208

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079520 WO2018171567A1 (fr) 2017-03-21 2018-03-20 Method, server and terminal for playing a media stream

Country Status (2)

Country Link
CN (1) CN108632681B (fr)
WO (1) WO2018171567A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111741314A (zh) * 2020-06-18 2020-10-02 聚好看科技股份有限公司 Video playing method and display device
CN115474083B (zh) * 2022-11-02 2023-03-14 灵长智能科技(杭州)有限公司 Multi-channel audio and video synchronized live streaming method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102802044A (zh) * 2012-06-29 2012-11-28 华为终端有限公司 Video processing method, terminal and subtitle server
WO2015008490A1 (fr) * 2013-07-19 2015-01-22 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Transmission method, reception method, transmission device and reception device
CN105872583A (zh) * 2015-11-20 2016-08-17 乐视网信息技术(北京)股份有限公司 Multifunctional media playing method and device
CN106210846A (zh) * 2016-08-15 2016-12-07 深圳Tcl新技术有限公司 Audio and video playing method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6587985B1 (en) * 1998-11-30 2003-07-01 Matsushita Electric Industrial Co., Ltd. Data transmission method, data transmission apparatus, data receiving apparatus, and packet data structure
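CN102611894B (zh) * 2012-03-02 2015-01-07 华为技术有限公司 Method, device and system for detecting packet loss in video transmission
CN103167359B (zh) * 2013-03-27 2016-03-02 华为技术有限公司 RTP media stream transmission method and device
CN106488265A (zh) * 2016-10-12 2017-03-08 广州酷狗计算机科技有限公司 Method and device for sending a media stream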

Also Published As

Publication number Publication date
CN108632681A (zh) 2018-10-09
CN108632681B (zh) 2020-04-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18770489

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18770489

Country of ref document: EP

Kind code of ref document: A1