Disclosure of Invention
      In view of this, the embodiments of the present application provide a scheme for selecting a backbone network in real-time audio and video transmission. The scheme comprises a method, a device and a system for selecting a backbone network in real-time audio and video transmission. The scheme balances the audio and video experience quality against the operation cost of a real-time audio and video call, so that a user can obtain good audio and video experience quality at a lower operation cost.
      In a first aspect, a method for selecting a backbone network in real-time audio and video transmission is provided, which is performed by a user terminal. The user terminal queries a first corresponding table according to the current audio and video experience quality of a media engine that plays the audio and video while the audio and video is transmitted on a real-time transmission network, and obtains a first audio and video experience quality corresponding to the current audio and video experience quality. The audio and video experience quality represents the quality of the user's experience of the audio and video, and is divided into a plurality of gears in the first corresponding table. The first corresponding table records, for each gear, the correspondence of audio and video experience quality when switching from the real-time transmission network to the peer-to-peer network, and the first audio and video experience quality is the audio and video experience quality to be used after switching from the real-time transmission network to the peer-to-peer network. The user terminal then queries a second corresponding table according to the first audio and video experience quality to obtain a first link service quality corresponding to the first audio and video experience quality. The first link service quality represents the link communication quality of a backbone network for transmitting the audio and video, and the second corresponding table records the correspondence between audio and video experience quality and link service quality. The user terminal obtains the current link service quality of the peer-to-peer network, and determines, according to the first link service quality, whether the current link service quality reaches the standard. When the current link service quality is determined to reach the standard, the user terminal switches the audio and video to the peer-to-peer network for transmission.
      With reference to the first aspect, in a possible implementation manner, before switching the audio/video to the peer-to-peer network for transmission, the method further includes: when the current link service quality is determined to reach the standard, the user terminal sets the audio and video experience quality of the media engine to the first audio and video experience quality.
      With reference to the first aspect, in a possible implementation manner, before querying the first corresponding table to obtain the first audio/video experience quality corresponding to the current audio/video experience quality, the user terminal determines the backbone network on which to start playing the audio/video according to the call time of the initial call to the peer-to-peer network.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, the determining, by the user terminal, a backbone network in the network for starting playing the audio and video according to a call time for starting a call to the peer-to-peer network specifically includes: when the calling time is larger than a first time threshold value, determining the backbone network as a real-time transmission network; when the call time is less than the first time threshold, the backbone network is determined to be a peer-to-peer network.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, after determining the backbone network on which to start playing the audio/video, the user terminal transmits the audio/video on the determined backbone network and sets the audio/video experience quality of the media engine to a predetermined initial audio/video experience quality, so that playback of the audio/video can start smoothly over the access network.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, after switching the audio/video to the peer-to-peer network for transmission, the user terminal measures a first delay jitter and/or a first packet loss rate of the peer-to-peer network, and changes the gear of the audio and video experience quality of the media engine according to the first delay jitter and/or the first packet loss rate.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, the changing, by the user terminal, the gear of the audio/video experience quality of the media engine according to the first delay jitter and/or the first packet loss rate specifically includes: when the first time delay jitter is higher than a first time delay jitter threshold value or the first packet loss rate is higher than a first packet loss rate threshold value, reducing the gear of the audio and video experience quality; and when the first time delay jitter is lower than the first time delay jitter threshold and the first packet loss rate is lower than the first packet loss rate threshold, improving the gear of the audio and video experience quality.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, after the audio/video is switched to the peer-to-peer network for transmission, when the audio/video experience quality of the media engine while transmitting the audio/video on the peer-to-peer network is lower than a predetermined switching threshold for switching to the real-time transmission network, the user terminal obtains the network quality of an access network in the network. When the user terminal determines that the network quality of the access network does not affect the audio and video experience quality, it obtains a predetermined second audio and video experience quality to be used after switching to the real-time transmission network. The user terminal sets the audio and video experience quality of the media engine to the second audio and video experience quality, and then switches the audio and video to the real-time transmission network for transmission.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, after the audio/video is switched to the real-time transmission network for transmission, the user terminal measures a second delay jitter and/or a second packet loss rate of the real-time transmission network, and changes the gear of the audio and video experience quality of the media engine according to the second delay jitter and/or the second packet loss rate.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, the changing, by the user terminal, the gear of the audio/video experience quality of the media engine according to the second delay jitter and/or the second packet loss rate specifically includes: when the second time delay jitter is higher than a second time delay jitter threshold or the second packet loss rate is higher than a second packet loss rate threshold, reducing the gear of the audio and video experience quality; and when the second time delay jitter is lower than a second time delay jitter threshold value and the second packet loss rate is lower than a second packet loss rate threshold value, improving the gear of the audio and video experience quality.
      With reference to the first aspect or any one of the possible implementation manners of the first aspect, in a possible implementation manner, the quality of audio/video experience further includes a video mean opinion score, where the video mean opinion score is determined according to the video quality, the operation experience, and the playing experience, and is used to reflect a subjective video experience of a user.
      In a second aspect, an apparatus for selecting a backbone network in real-time audio and video transmission is provided, and the apparatus is disposed in a user terminal. The apparatus includes a first acquisition unit, a second acquisition unit, a third acquisition unit, a first determination unit and a first switching unit. The first acquisition unit is configured to query a first corresponding table according to the current audio and video experience quality of a media engine that plays the audio and video while the audio and video is transmitted on a real-time transmission network, and to obtain a first audio and video experience quality corresponding to the current audio and video experience quality. The audio and video experience quality represents the quality of the user's experience of the audio and video, and is divided into a plurality of gears in the first corresponding table. The first corresponding table records, for each gear, the correspondence of audio and video experience quality when switching from the real-time transmission network to the peer-to-peer network, and the first audio and video experience quality is the audio and video experience quality to be used after switching from the real-time transmission network to the peer-to-peer network. The second acquisition unit is configured to query a second corresponding table according to the first audio and video experience quality and to obtain a first link service quality corresponding to the first audio and video experience quality. The first link service quality represents the link communication quality of a backbone network for transmitting the audio and video, and the second corresponding table records the correspondence between audio and video experience quality and link service quality. The third acquisition unit is configured to obtain the current link service quality of the peer-to-peer network. The first determination unit is configured to determine, according to the first link service quality, whether the current link service quality reaches the standard. The first switching unit is configured to switch the audio and video to the peer-to-peer network for transmission when the current link service quality is determined to reach the standard.
      With reference to the second aspect, in one possible implementation, the apparatus further includes a first setting unit. The first setting unit is configured to set the audio and video experience quality of the media engine to the first audio and video experience quality when the current link service quality is determined to reach the standard.
      With reference to the second aspect, in one possible implementation manner, the apparatus further includes a second determining unit. The second determining unit is configured to determine the backbone network on which to start playing the audio and video according to the call time of the initial call to the peer-to-peer network.
      With reference to the second aspect or any possible implementation manner of the second aspect, in a possible implementation manner, the second determining unit specifically includes a first determining module and a second determining module. The first determination module is configured to: and when the calling time is greater than a first time threshold value, determining the backbone network as a real-time transmission network. The second determination module is configured to: when the call time is less than the first time threshold, the backbone network is determined to be a peer-to-peer network.
      With reference to the second aspect or any one of the possible embodiments of the second aspect, in one possible embodiment, the apparatus further includes a second setting unit. The second setting unit is configured to transmit the audio and video on the determined backbone network, and to set the audio and video experience quality of the media engine to a predetermined initial audio and video experience quality, so that playback of the audio and video can start smoothly over the access network.
      With reference to the second aspect or any one of the possible embodiments of the second aspect, in one possible embodiment, the apparatus further includes a first measurement unit and a first changing unit. The first measurement unit is configured to measure a first delay jitter and/or a first packet loss rate of the peer-to-peer network. The first changing unit is configured to change the gear of the audio and video experience quality of the media engine according to the first delay jitter and/or the first packet loss rate.
      With reference to the second aspect or any one of the possible embodiments of the second aspect, in one possible embodiment, the first changing unit specifically includes a first lowering module and a first raising module. The first lowering module is configured to lower the gear of the audio and video experience quality when the first delay jitter is higher than a first delay jitter threshold or the first packet loss rate is higher than a first packet loss rate threshold. The first raising module is configured to raise the gear of the audio and video experience quality when the first delay jitter is lower than the first delay jitter threshold and the first packet loss rate is lower than the first packet loss rate threshold.
      With reference to the second aspect or any possible implementation manner of the second aspect, in a possible implementation manner, the apparatus further includes a fourth acquiring unit, a fifth acquiring unit, a third setting unit, and a second switching unit. The fourth acquiring unit is configured to obtain the network quality of an access network in the network when the audio and video experience quality of the media engine while transmitting the audio and video on the peer-to-peer network is lower than a predetermined switching threshold for switching to the real-time transmission network. The fifth acquiring unit is configured to obtain, when the network quality of the access network is determined not to affect the audio and video experience quality, a predetermined second audio and video experience quality to be used after switching to the real-time transmission network. The third setting unit is configured to set the audio and video experience quality of the media engine to the second audio and video experience quality. The second switching unit is configured to switch the audio and video to the real-time transmission network for transmission.
      With reference to the second aspect or any one of the possible embodiments of the second aspect, in one possible embodiment, the apparatus further includes a second measurement unit and a second changing unit. The second measurement unit is configured to measure a second delay jitter and/or a second packet loss rate of the real-time transmission network. The second changing unit is configured to change the gear of the audio and video experience quality of the media engine according to the second delay jitter and/or the second packet loss rate.
      With reference to the second aspect or any one of the possible embodiments of the second aspect, in one possible embodiment, the second changing unit specifically includes a second lowering module and a second raising module. The second lowering module is configured to lower the gear of the audio and video experience quality when the second delay jitter is higher than a second delay jitter threshold or the second packet loss rate is higher than a second packet loss rate threshold. The second raising module is configured to raise the gear of the audio and video experience quality when the second delay jitter is lower than the second delay jitter threshold and the second packet loss rate is lower than the second packet loss rate threshold.
      With reference to the second aspect or any one of the possible implementation manners of the second aspect, in a possible implementation manner, the quality of the audio-video experience further includes a video mean opinion score, where the video mean opinion score is determined according to the video quality, the operation experience, and the playing experience, and is used for reflecting the subjective video experience of the user.
      In a third aspect, a system for selecting a backbone network in real-time transmission of audio and video is provided. The system comprises a user terminal on which the apparatus described in the second aspect and any one of the possible designs of the second aspect is provided.
      In a fourth aspect, a computing device is provided. The computing device includes a processor and a memory. The memory is for storing computer instructions. The processor is configured to execute the computer instructions stored by the memory to cause the computing device to perform the method described in the first aspect and any one of the possible designs of the first aspect.
      In a fifth aspect, a computer program product is provided. The computer program product comprises computer instructions for instructing a computing device to perform the method as set forth in the first aspect and any one of the possible designs of the first aspect.
    
    
      Detailed Description
      Fig. 1 is a schematic view of an implementation scenario of an embodiment disclosed in this specification. In this implementation scenario, a user makes a real-time audio/video call on a user terminal. The scenario includes two sub-scenarios: when the call is initiated, and during the call.
      When the call is initiated, the user terminal first judges whether P2P traversal succeeds within a preset time. If it does, the audio and video is transmitted on the peer-to-peer (P2P) backbone network; otherwise, the audio and video is transmitted on the real-time transmission network (RTN) backbone network. At this time, the user terminal configures the audio and video quality of experience (QoE) of the media engine at a lower gear, so that the call can start smoothly under various access network quality conditions. After the call is started, the media engine adaptively raises or lowers the QoE according to the real-time link quality of service (QoS) of the backbone network currently in use, aiming for the best achievable QoE.
      During the call, while the audio and video is transmitted on the RTN, the user terminal continuously detects and evaluates the link QoS of P2P. When the link QoS of P2P reaches the standard, the user terminal adjusts the QoE of the media engine to a gear suitable for transmission on P2P and switches the backbone network for transmitting the audio and video to P2P. The media engine then adaptively raises or lowers its QoE gear based on the real-time link QoS of P2P, aiming for the best achievable QoE. When the QoE of the media engine has been adaptively lowered to the gear at which P2P falls back to the RTN, the user terminal judges whether the downshift is caused by the access network. If it is not caused by the access network, the user terminal switches the backbone network for transmitting the audio and video to the RTN. The media engine then adaptively raises or lowers its QoE gear based on the real-time link QoS of the RTN, again aiming for the best achievable QoE.
      The above describes in detail the process of selecting the RTN to transmit the audio and video when the call is initiated, then switching from the RTN to P2P and from P2P back to the RTN according to the real-time link QoS of P2P and the QoE of the media engine. It should be understood that the process of selecting P2P to transmit the audio and video when the call is initiated, then switching from P2P to the RTN and from the RTN back to P2P according to the real-time link QoS of P2P and the QoE of the media engine, is similar and is not described herein again.
      It should be noted that the QoE mentioned herein represents the quality of the user's experience of the audio/video, and includes resolution, frame rate, and code rate. Preferably, it also includes a Video Mean Opinion Score (VMOS). Resolution is the number of pixels per inch and reflects the sharpness of the image. Frame rate is the frequency at which bitmap images, called frames, appear consecutively on the display, and reflects the fluency of the video. Code rate is the number of data bits transmitted per unit time. The VMOS is determined from video quality, operation experience, and playing experience, and reflects the user's subjective video experience.
      The QoS mentioned herein represents the link communication quality of the backbone network for transmitting the audio and video, and includes bandwidth, delay, delay jitter, and packet loss rate. Bandwidth is the amount of data that can pass through a link per unit time, usually expressed in bps. Delay is the time required for a message or packet to travel from one end of a network to the other. Delay jitter is the variation of the delay. Packet loss rate is the ratio of the number of lost packets to the number of packets transmitted.
      Empirically, QoE can be divided into a number of different gears, each corresponding to the QoS required to sustain that QoE.
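      As a minimal sketch of how the QoS quantities defined above might be computed from raw measurements, the following Python example derives a packet loss rate and a delay jitter value. The function names and the jitter estimator (mean absolute difference between consecutive delay samples) are assumptions made for illustration; the text only states that jitter is the variation of delay and does not prescribe a formula.

```python
from statistics import mean
from typing import Sequence


def packet_loss_rate(sent: int, received: int) -> float:
    """Ratio of lost packets to packets transmitted (0.0 if nothing was sent)."""
    if sent <= 0:
        return 0.0
    return (sent - received) / sent


def delay_jitter_ms(delays_ms: Sequence[float]) -> float:
    """Assumed estimator: mean absolute difference of consecutive delay samples."""
    if len(delays_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return mean(diffs)


if __name__ == "__main__":
    print(packet_loss_rate(sent=100, received=90))   # 10 of 100 packets lost
    print(delay_jitter_ms([80.0, 100.0, 80.0]))      # delays swing by 20 ms each step
```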
      The following will describe a specific implementation process of the user terminal selecting the backbone network in the real-time audio and video transmission.
      Fig. 2 is a schematic flow chart of a method for selecting a backbone network in real-time audio/video transmission according to an embodiment of the present application, where the method is executed by a user terminal and may include the following steps:
      S201, the user terminal queries a first corresponding table according to the current audio and video QoE of the media engine that plays the audio and video while the audio and video is transmitted on the real-time transmission network, and obtains the first audio and video QoE corresponding to the current audio and video QoE.
      Specifically, the audio/video QoE is used for representing the experience quality of the user on the audio/video, and is divided into a plurality of gears in a first corresponding table. The first corresponding table is used for recording audio and video QoE corresponding relations under each gear when the real-time transmission network is switched to the peer-to-peer network. Optionally, the first audio/video QoE is the audio/video QoE after being correspondingly switched to the peer-to-peer network.
      Optionally, the first mapping table is Table 1, the QoE mapping table used when the RTN switches to P2P. As shown in Table 1, under the same coding type, the QoE is divided into a plurality of gears, and the RTN current resolution, RTN current frame rate, and RTN current code rate of each gear correspond, respectively, to the resolution, frame rate, and code rate after switching to P2P.
      For example, when rtnR1 = 1080p, rtnF1 = 30fps, and rtnTx1 = 2250kbps in Table 1, then p2pR1 = 1080p, p2pF1 = 25fps, and p2pTx1 = 1875kbps.
      Alternatively, when rtnR1 = 1080p, rtnF1 = 30fps, and rtnTx1 = 2250kbps in Table 1, then p2pR1 = 720p, p2pF1 = 25fps, and p2pTx1 = 1000kbps.
      Alternatively, when rtnR1 = 1080p, rtnF1 = 30fps, and rtnTx1 = 2250kbps in Table 1, then p2pR1 = 540p, p2pF1 = 25fps, and p2pTx1 = 705kbps.
      It should be understood that the specific values in table 1 can be set according to the business needs by using corresponding formulas and empirical values.
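      The lookup in S201 can be sketched as a dictionary keyed by the current RTN gear. This is only an illustration: the `QoE` type, the table name, and the function name are hypothetical, and the single row uses one of the example value sets given above (1080p/30fps/2250kbps on the RTN mapping to 1080p/25fps/1875kbps on P2P). A real table would hold one row per gear, with values set by formula and experience as the text notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QoE:
    resolution: str        # e.g. "1080p"
    frame_rate_fps: int
    code_rate_kbps: int


# Hypothetical first mapping table: RTN gear -> gear to use after switching to P2P.
# Only this one row is taken from the example values in the text.
RTN_TO_P2P_QOE = {
    QoE("1080p", 30, 2250): QoE("1080p", 25, 1875),
}


def first_qoe_after_switch(current_rtn_qoe: QoE) -> Optional[QoE]:
    """Return the QoE to apply after switching to P2P, or None if no row matches."""
    return RTN_TO_P2P_QOE.get(current_rtn_qoe)


if __name__ == "__main__":
    print(first_qoe_after_switch(QoE("1080p", 30, 2250)))
```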
      TABLE 1 QoE mapping Table when RTN switches to P2P
      
      Optionally, before S201, when the real-time audio/video call is started, the user terminal determines the backbone network on which to start playing the audio and video according to the call time of the initial call to the peer-to-peer network. For example, when the call time is greater than a first time threshold, the user terminal determines that the backbone network is the real-time transmission network. For another example, when the call time is less than the first time threshold, the user terminal determines that the backbone network is the peer-to-peer network. It should be understood that the first time threshold may be set based on empirical values.
      Determining the backbone network for starting playback by judging the call time allows the audio and video to start smoothly while also taking the operation cost into account.
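      The start-of-call decision described above can be sketched as a single comparison. The names below are hypothetical, and the threshold value used in the usage lines is a placeholder, since the text leaves it to empirical tuning. The text does not say what happens when the call time equals the threshold; this sketch assumes the RTN is chosen in that case.

```python
from enum import Enum


class Backbone(Enum):
    RTN = "real-time transmission network"
    P2P = "peer-to-peer network"


def select_initial_backbone(p2p_call_time_s: float, first_time_threshold_s: float) -> Backbone:
    """Pick the backbone for starting playback from the P2P call time."""
    if p2p_call_time_s < first_time_threshold_s:
        return Backbone.P2P
    # Greater than the threshold (or equal, by assumption): start on the RTN.
    return Backbone.RTN


if __name__ == "__main__":
    threshold_s = 3.0  # placeholder value, not from the source
    print(select_initial_backbone(1.2, threshold_s))
    print(select_initial_backbone(4.5, threshold_s))
```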
      Optionally, in order to start playing the audio and video smoothly and ensure that the call can be initiated in various quality environments of an access network in the network, the user terminal sets the audio and video QoE of the media engine to a predetermined initial audio and video QoE, where the initial audio and video QoE is located at a lower gear, and for example, the predetermined initial resolution, the predetermined initial frame rate, and the predetermined initial bit rate are 540p, 25fps, and 705kbps, respectively. It should be understood that the specific values of the predetermined initial audio video QoE described above are for exemplary purposes only. The initial audio video QoE may be set to other values according to the link condition of the access network.
      By setting the audio and video QoE at a lower gear, the audio and video can be smoothly played under various quality environments of an access network.
      S202, the user terminal queries a second corresponding table according to the first audio and video QoE and obtains a first link QoS corresponding to the first audio and video QoE.
      Specifically, the link QoS is used to represent link communication quality of a backbone network for transmitting audio and video, and the second mapping table is used to record a corresponding relationship between the audio and video QoE and the link QoS.
      Optionally, the second mapping table is Table 2, the correspondence table between audio/video QoE and link QoS. As shown in Table 2, the QoE and QoS are divided into a plurality of gears under the same coding type. According to the service requirement, the specific values of resolution, frame rate, code rate, bandwidth, delay, delay jitter, and packet loss rate corresponding to each gear can be set by using corresponding formulas and empirical values.
      Alternatively, when R1, F1, and Tx11 in table 2 take 1080p, 30fps, and 2250kbps, respectively, Bx11 is 4050kbps, Dx11 is 200ms, Jx11 is 200ms, and Lx11 is 30%.
      Optionally, when R2, F2, and Tx21 in table 2 take values of 720p, 30fps, and 1180kbps, respectively, Bx21 is 1888kbps, Dx21 is 200ms, Jx21 is 200ms, and Lx21 is 30%.
      Optionally, the values of the resolution, the frame rate, and the code rate of the first audio/video QoE obtained by the user terminal are 720p, 30fps, and 1180kbps, respectively, and the values of the bandwidth, the delay, the delay jitter, and the packet loss rate of the first link QoS corresponding to the first audio/video QoE, obtained by querying Table 2, are 1888kbps, 200ms, 200ms, and 30%, respectively.
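      A sketch of the S202 lookup follows, using the two example rows quoted above for Table 2 (Bx, Dx, Jx, Lx for the 1080p and 720p gears). The type, table, and function names are hypothetical, and a deployed table would contain every gear for the coding type in use.

```python
from typing import NamedTuple, Optional, Tuple


class RequiredQoS(NamedTuple):
    bandwidth_kbps: int
    delay_ms: int
    jitter_ms: int
    loss_rate: float  # fraction, so 30% is 0.30


# Hypothetical second mapping table keyed by (resolution, frame rate, code rate).
# Both rows reuse the example values given in the text for Table 2.
QOE_TO_REQUIRED_QOS = {
    ("1080p", 30, 2250): RequiredQoS(4050, 200, 200, 0.30),
    ("720p", 30, 1180): RequiredQoS(1888, 200, 200, 0.30),
}


def first_link_qos(first_qoe: Tuple[str, int, int]) -> Optional[RequiredQoS]:
    """Return the link QoS required for the given QoE gear, or None if unknown."""
    return QOE_TO_REQUIRED_QOS.get(first_qoe)


if __name__ == "__main__":
    print(first_link_qos(("720p", 30, 1180)))
```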
      Table 2 audio video QoE and link QoS corresponding table
      
      Optionally, the second mapping table is Table 3, the correspondence table between audio/video QoE and link QoS. Compared with Table 2, the audio and video QoE in Table 3 further includes a video mean opinion score, that is, a VMOS, whose value is, for example, 2.5 to 3. Other values in Table 3 are similar to those in Table 2 and are not described herein again.
      Table 3 audio video QoE and link QoS corresponding table
      
      S203, the user terminal acquires the current link QoS of the peer-to-peer network.
      Specifically, the user terminal measures the link of the peer-to-peer network to obtain the current link QoS of the peer-to-peer network. For example, the bandwidth, delay, delay jitter, and packet loss rate of the current link QoS are 2000kbps, 80ms, 50ms, and 10%, respectively.
      S204, the user terminal determines whether the current link QoS reaches the standard according to the first link QoS.
      Specifically, when the current link QoS measured by the user terminal reaches or exceeds the first link QoS, it is determined that the current link QoS has reached the standard. For example, as described above, the user terminal queries Table 2, and the queried values of the bandwidth, delay, delay jitter, and packet loss rate of the first link QoS are > 1888kbps, < 200ms, < 200ms, and < 30%, respectively; the bandwidth, delay, delay jitter, and packet loss rate of the current link QoS of the peer-to-peer network obtained by the user terminal through measurement are 2000kbps, 80ms, 50ms, and 10%, respectively. In this case, the current link QoS has reached the first link QoS, so the current link QoS is determined to reach the standard. Otherwise, the current link QoS is determined not to reach the standard.
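      The check in S204 can be sketched as a field-by-field comparison of the measured P2P link QoS against the first link QoS. The `LinkQoS` type and function name are hypothetical. The strict inequalities follow the "> 1888kbps, < 200ms, < 30%" notation of the example; whether equality should also count as reaching the standard is an assumption left open here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkQoS:
    bandwidth_kbps: float
    delay_ms: float
    jitter_ms: float
    loss_rate: float  # fraction in [0, 1]


def reaches_standard(current: LinkQoS, required: LinkQoS) -> bool:
    """True when bandwidth is above, and delay, jitter and loss are below, the requirement."""
    return (
        current.bandwidth_kbps > required.bandwidth_kbps
        and current.delay_ms < required.delay_ms
        and current.jitter_ms < required.jitter_ms
        and current.loss_rate < required.loss_rate
    )


if __name__ == "__main__":
    required = LinkQoS(1888, 200, 200, 0.30)  # first link QoS from the example
    measured = LinkQoS(2000, 80, 50, 0.10)    # measured P2P link QoS from the example
    print(reaches_standard(measured, required))  # True
```

      With the example values, every field satisfies its bound, so the audio and video would be switched to the peer-to-peer network in S205.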
      S205, when the user terminal determines that the QoS of the current link of the peer-to-peer network reaches the standard, the audio and video are switched to the peer-to-peer network for transmission.
      Specifically, when the user terminal determines that the current link QoS of the peer-to-peer network reaches the standard, the audio/video QoE of the media engine is set as the first audio/video QoE, and then the audio/video is switched to the peer-to-peer network for transmission.
      When the current link QoS of the peer-to-peer network reaches the standard, the user terminal sets the audio and video QoE of the media engine to the first audio and video QoE, so that the audio and video can be transmitted smoothly on the peer-to-peer network under its current link QoS, with high definition and high fluency. In addition, switching the audio and video from the real-time transmission network to the peer-to-peer network for transmission reduces the operation cost.
      Optionally, when transmitting the audio and video over the peer-to-peer network, the user terminal measures a first delay jitter and/or a first packet loss rate of the peer-to-peer network, and changes the gear of the audio and video QoE of the media engine according to the measured first delay jitter and/or first packet loss rate.
      Optionally, when the measured first delay jitter is lower than a first delay jitter threshold and the first packet loss rate is lower than a first packet loss rate threshold, the gear of the audio and video QoE is raised.
      Optionally, when the measured first delay jitter is higher than the first delay jitter threshold, the gear of the audio and video QoE is lowered.
      Optionally, when the measured first packet loss rate is higher than the first packet loss rate threshold, the gear of the audio and video QoE is lowered.
      By measuring the delay jitter and/or the packet loss rate of the peer-to-peer network, the user terminal can change the gear of the audio and video QoE of the media engine in real time, so as to obtain the optimal audio and video QoE.
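      The gear adjustment rules above can be sketched as a single function. This is a minimal sketch, not the claimed implementation: the function name `adjust_gear`, the integer gear numbering (a higher number meaning better QoE), the gear range of 1 to 5, and the step of one gear per adjustment are all assumptions made for illustration. A measurement exactly equal to a threshold is not covered by the rules, so the sketch leaves the gear unchanged in that case.

```python
def adjust_gear(
    gear: int,
    jitter_ms: float,
    loss_rate: float,
    jitter_threshold_ms: float,
    loss_threshold: float,
    min_gear: int = 1,  # hypothetical lowest gear
    max_gear: int = 5,  # hypothetical highest gear
) -> int:
    """Return the new QoE gear for one measurement of the current backbone network."""
    if jitter_ms > jitter_threshold_ms or loss_rate > loss_threshold:
        # Either metric above its threshold: lower the gear, but not below min_gear.
        return max(min_gear, gear - 1)
    if jitter_ms < jitter_threshold_ms and loss_rate < loss_threshold:
        # Both metrics below their thresholds: raise the gear, but not above max_gear.
        return min(max_gear, gear + 1)
    # A metric sits exactly on its threshold: keep the current gear (assumption).
    return gear


# Hypothetical thresholds of 100 ms delay jitter and 5% packet loss rate.
print(adjust_gear(3, jitter_ms=50, loss_rate=0.01, jitter_threshold_ms=100, loss_threshold=0.05))
print(adjust_gear(3, jitter_ms=150, loss_rate=0.01, jitter_threshold_ms=100, loss_threshold=0.05))
```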
      Optionally, when the audio and video QoE of the media engine transmitting the audio and video on the peer-to-peer network is lower than a predetermined switching threshold for switching to the real-time transmission network, the user terminal first obtains the network quality of the access network transmitting the audio and video. When the network quality of the access network is found to affect the audio and video QoE, the user terminal determines that the audio and video continue to be transmitted on the peer-to-peer network. Otherwise, the user terminal queries and obtains a predetermined second audio and video QoE to be used after switching to the real-time transmission network. The user terminal then sets the audio and video QoE of the media engine to the second audio and video QoE and switches the audio and video to the real-time transmission network for transmission.
      Because the network quality of the access network is assessed first, and the audio and video are switched from the peer-to-peer network to the real-time transmission network only when the network quality of the access network does not affect the audio and video QoE, the higher operation cost of an erroneous switch caused by poor access network quality can be avoided while a better audio and video QoE is still ensured.
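      The switch-back decision described above can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Backbone` and `decide_while_on_p2p` are hypothetical, QoE gears are modeled as integers with a higher number meaning better QoE, and the flag `access_network_affects_qoe` stands in for whatever access network assessment the user terminal performs, which this passage does not detail.

```python
from enum import Enum
from typing import Tuple


class Backbone(Enum):
    P2P = "peer-to-peer network"
    RTN = "real-time transmission network"


def decide_while_on_p2p(
    current_gear: int,
    switch_threshold_gear: int,
    access_network_affects_qoe: bool,
    second_qoe_gear: int,
) -> Tuple[Backbone, int]:
    """Return the backbone network to use next and the QoE gear to set on the media engine."""
    if current_gear >= switch_threshold_gear:
        # QoE has not dropped below the switching threshold: nothing to do.
        return Backbone.P2P, current_gear
    if access_network_affects_qoe:
        # The access network is the bottleneck, so switching the backbone would
        # add cost without improving QoE: stay on the peer-to-peer network.
        return Backbone.P2P, current_gear
    # The backbone is the likely cause: switch and apply the second QoE.
    return Backbone.RTN, second_qoe_gear


print(decide_while_on_p2p(1, 2, access_network_affects_qoe=False, second_qoe_gear=3))
```

      Checking the access network before the backbone keeps the more expensive real-time transmission network as a last resort, which is the cost argument made in the text.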
      Optionally, when the audio and video is transmitted on the real-time transmission network, the user terminal measures a second delay jitter and/or a second packet loss rate of the real-time transmission network, and changes the gear of the audio and video QoE of the media engine according to the measured second delay jitter and/or second packet loss rate.
      Optionally, when the measured second delay jitter is lower than a second delay jitter threshold and the second packet loss rate is lower than a second packet loss rate threshold, the gear of the audio and video QoE is raised.
      Optionally, when the measured second delay jitter is higher than the second delay jitter threshold, the gear of the audio and video QoE is lowered.
      Optionally, when the measured second packet loss rate is higher than the second packet loss rate threshold, the gear of the audio and video QoE is lowered.
      By measuring the delay jitter and/or the packet loss rate of the real-time transmission network, the user terminal can change the gear of the audio and video QoE of the media engine in real time, so as to obtain the optimal audio and video QoE.
      It should be understood that the above-mentioned "first delay jitter threshold", "second delay jitter threshold", "first packet loss rate threshold", and "second packet loss rate threshold" may be set according to empirical values and are not limited herein. The terms "first" and "second" serve only to distinguish the thresholds from one another and carry no other limiting meaning.
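      Because the "first" and "second" thresholds are independent empirical values, the same measurement may lead to different gear changes on the two backbone networks. The sketch below illustrates this with a per-network threshold table. All names (`Thresholds`, `THRESHOLDS`, `gear_direction`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    jitter_ms: float
    loss_rate: float


# Hypothetical empirical values; a real deployment would tune them.
THRESHOLDS = {
    "p2p": Thresholds(jitter_ms=100.0, loss_rate=0.05),  # "first" thresholds
    "rtn": Thresholds(jitter_ms=60.0, loss_rate=0.02),   # "second" thresholds
}


def gear_direction(network: str, jitter_ms: float, loss_rate: float) -> int:
    """Return +1 to raise the gear, -1 to lower it, or 0 to keep it."""
    t = THRESHOLDS[network]
    if jitter_ms > t.jitter_ms or loss_rate > t.loss_rate:
        return -1
    if jitter_ms < t.jitter_ms and loss_rate < t.loss_rate:
        return 1
    return 0  # a metric sits exactly on its threshold (case not covered by the text)


# The same measurement evaluated against each network's thresholds.
print(gear_direction("p2p", jitter_ms=80, loss_rate=0.01))
print(gear_direction("rtn", jitter_ms=80, loss_rate=0.01))
```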
      The method for selecting the backbone network in the real-time audio and video transmission according to the embodiment of the present application is described above in detail, and an apparatus for selecting the backbone network in the real-time audio and video transmission according to the embodiment of the present application is described below in detail.
      Fig. 3 is a schematic structural diagram of an apparatus for selecting a backbone network in real-time audio/video transmission according to an embodiment of the present disclosure, where the apparatus is disposed in a user terminal. As shown in fig. 3, the apparatus 300 includes a first acquisition unit 301, a second acquisition unit 302, a third acquisition unit 303, a first determination unit 304, and a first switching unit 305.
      The first acquisition unit 301 is configured to: query a first correspondence table according to the current audio and video experience quality of a media engine playing the audio and video while the audio and video are transmitted on a real-time transmission network, and acquire the first audio and video experience quality corresponding to the current audio and video experience quality. The audio and video experience quality expresses the quality of the user's experience of the audio and video, and is divided into a plurality of gears in the first correspondence table. The first correspondence table records, for each gear, the correspondence of the audio and video experience quality when the real-time transmission network is switched to the peer-to-peer network, for example, table 1 described above, which is not described herein again. The first audio and video experience quality is the audio and video experience quality after the corresponding switch to the peer-to-peer network.
      The second acquisition unit 302 is configured to: query a second correspondence table according to the first audio and video experience quality, and acquire a first link service quality corresponding to the first audio and video experience quality. The first link service quality represents the link communication quality of a backbone network for transmitting the audio and video. The second correspondence table records the correspondence between audio and video experience quality and link service quality, for example, table 2 or table 3 described above, which is not described herein again.
      The third acquisition unit 303 is configured to: obtain the current link service quality of the peer-to-peer network.
      The first determination unit 304 is configured to: determine, according to the first link service quality, whether the current link service quality reaches the standard.
      The first switching unit 305 is configured to: switch the audio and video to the peer-to-peer network for transmission when the current link service quality is determined to reach the standard.
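      How units 301 to 305 cooperate can be sketched as one function. This is a minimal sketch, not the claimed implementation. `TABLE_1` and `TABLE_2` are hypothetical stand-ins for table 1 and table 2, with invented gear names and invented numeric values, since the actual table contents are not reproduced in this passage. The function name `try_switch_to_p2p` and the callable `measure_p2p_qos` are likewise assumptions.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for table 1: QoE gear on the real-time transmission
# network -> QoE gear after switching to the peer-to-peer network.
TABLE_1: Dict[str, str] = {"high": "medium", "medium": "low", "low": "low"}

# Hypothetical stand-in for table 2: QoE gear -> required link QoS.
TABLE_2: Dict[str, Dict[str, float]] = {
    "medium": {"min_bandwidth_kbps": 1000, "max_jitter_ms": 200, "max_loss_rate": 0.30},
    "low": {"min_bandwidth_kbps": 500, "max_jitter_ms": 300, "max_loss_rate": 0.40},
}


def try_switch_to_p2p(
    current_qoe: str,
    measure_p2p_qos: Callable[[], Dict[str, float]],
) -> Tuple[bool, str]:
    """Return (switched, qoe): whether to switch, and the QoE gear to use afterwards."""
    first_qoe = TABLE_1[current_qoe]   # first acquisition unit 301
    required = TABLE_2[first_qoe]      # second acquisition unit 302
    current = measure_p2p_qos()        # third acquisition unit 303
    reaches_standard = (               # first determination unit 304
        current["bandwidth_kbps"] > required["min_bandwidth_kbps"]
        and current["jitter_ms"] < required["max_jitter_ms"]
        and current["loss_rate"] < required["max_loss_rate"]
    )
    if reaches_standard:               # first switching unit 305
        return True, first_qoe
    return False, current_qoe


good_link = lambda: {"bandwidth_kbps": 2000, "jitter_ms": 80, "loss_rate": 0.10}
print(try_switch_to_p2p("high", good_link))
```

      Passing the measurement in as a callable keeps the sketch independent of any particular probing mechanism.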
      Optionally, the apparatus 300 further comprises a first setting unit. The first setting unit is configured to: set the audio and video experience quality of the media engine to the first audio and video experience quality when the current link service quality is determined to reach the standard.
      Optionally, the apparatus 300 further comprises a second determination unit. The second determination unit is configured to: determine, according to the call time of an initial call to the peer-to-peer network, the backbone network in the network on which the audio and video are started.
      Specifically, the second determination unit includes a first determination module and a second determination module. The first determination module is configured to: determine the backbone network to be the real-time transmission network when the call time is greater than a first time threshold. The second determination module is configured to: determine the backbone network to be the peer-to-peer network when the call time is less than the first time threshold.
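      The initial selection made by the second determination unit can be sketched as a simple comparison. This is a minimal sketch: the function name `select_initial_backbone` and the use of milliseconds are assumptions. The text does not state what happens when the call time equals the first time threshold, so the sketch chooses the peer-to-peer network in that case, which is purely an assumption.

```python
def select_initial_backbone(call_time_ms: float, first_time_threshold_ms: float) -> str:
    """Choose the backbone network on which the audio and video are started."""
    if call_time_ms > first_time_threshold_ms:
        # Calling the peer-to-peer network took too long: start on the
        # real-time transmission network instead.
        return "real-time transmission network"
    # Call time below the threshold (or equal to it, by assumption).
    return "peer-to-peer network"


# Hypothetical threshold of 2000 ms.
print(select_initial_backbone(3000, 2000))
print(select_initial_backbone(500, 2000))
```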
      Optionally, the apparatus 300 further comprises a second setting unit. The second setting unit is configured to: transmit the audio and video on the network over the determined backbone network, and set the audio and video experience quality of the media engine to a preset initial audio and video experience quality, so that an access network in the network can smoothly start playing the audio and video.
      Optionally, the apparatus 300 further comprises a first measurement unit and a first changing unit. The first measurement unit is configured to: measure a first delay jitter and/or a first packet loss rate of the peer-to-peer network. The first changing unit is configured to: change the gear of the audio and video experience quality of the media engine according to the first delay jitter and/or the first packet loss rate.
      Specifically, the first changing unit includes a first lowering module and a first raising module. The first lowering module is configured to: lower the gear of the audio and video experience quality when the first delay jitter is higher than a first delay jitter threshold or the first packet loss rate is higher than a first packet loss rate threshold. The first raising module is configured to: raise the gear of the audio and video experience quality when the first delay jitter is lower than the first delay jitter threshold and the first packet loss rate is lower than the first packet loss rate threshold.
      Optionally, the apparatus 300 further includes a fourth acquisition unit, a fifth acquisition unit, a third setting unit, and a second switching unit. The fourth acquisition unit is configured to: acquire the network quality of an access network in the network when the audio and video experience quality of the media engine during audio and video transmission on the peer-to-peer network is lower than a preset switching threshold for switching to the real-time transmission network. The fifth acquisition unit is configured to: acquire a predetermined second audio and video experience quality, to be used after switching to the real-time transmission network, when the network quality of the access network is determined not to affect the audio and video experience quality. The third setting unit is configured to: set the audio and video experience quality of the media engine to the second audio and video experience quality. The second switching unit is configured to: switch the audio and video to the real-time transmission network for transmission.
      Optionally, the apparatus 300 further comprises a second measurement unit and a second changing unit. The second measurement unit is configured to: measure a second delay jitter and/or a second packet loss rate of the real-time transmission network. The second changing unit is configured to: change the gear of the audio and video experience quality of the media engine according to the second delay jitter and/or the second packet loss rate.
      Specifically, the second changing unit includes a second lowering module and a second raising module. The second lowering module is configured to: lower the gear of the audio and video experience quality when the second delay jitter is higher than a second delay jitter threshold or the second packet loss rate is higher than a second packet loss rate threshold. The second raising module is configured to: raise the gear of the audio and video experience quality when the second delay jitter is lower than the second delay jitter threshold and the second packet loss rate is lower than the second packet loss rate threshold.
      In one embodiment, a system for selecting a backbone network in real-time audio and video transmission is further provided. The system comprises a user terminal, and the user terminal is provided with the apparatus shown in fig. 3.
      In one embodiment, a computing device is also provided that includes a processor and a memory. The memory stores computer instructions. The processor executes the computer instructions stored by the memory, causing the computing device to perform the method illustrated in fig. 2.
      In one embodiment, a computer program product is also provided. The computer program product includes computer instructions that instruct a computing device to perform the method illustrated in fig. 2.
      The various embodiments of the present application described above may be implemented, in whole or in part, by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.
      The above-mentioned embodiments further describe in detail the objects, technical solutions, and advantages of the present application. It should be understood that the above-mentioned embodiments are only examples of the present application and are not intended to limit its scope. Any modifications, equivalent substitutions, improvements, and the like made on the basis of the technical solutions of the present application should be included in the scope of the present application.