
CN114997560B - User video quality difference root cause analysis method, electronic equipment and storage medium - Google Patents


Info

Publication number
CN114997560B
Authority
CN
China
Prior art keywords
data
complaint
knowledge base
user
network server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210372229.2A
Other languages
Chinese (zh)
Other versions
CN114997560A (en)
Inventor
梁哲
邢祥宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zznode Technology Co ltd
Original Assignee
Beijing Zznode Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zznode Technology Co ltd
Priority to CN202210372229.2A
Publication of CN114997560A
Application granted
Publication of CN114997560B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/4424 Monitoring of the internal components or processes of the client device, e.g. CPU or memory load, processing speed, timer, counter or percentage of the hard disk space used

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

In troubleshooting the root causes of poor Internet TV service quality, combining set-top box probe data with gateway probe data allows fault points in the home local area network to be located more accurately and improves troubleshooting efficiency. Historical data can be processed effectively, data storage pressure is reduced, knowledge is effectively extracted, and root cause analysis can be performed from multiple aspects.

Description

A root cause analysis method for poor user video quality, electronic device, and storage medium

Technical Field

The invention relates to network management technology, and in particular to a root cause analysis method for poor user video quality, an electronic device, and a storage medium.

Background Art

Many gateways come with pre-installed probes that monitor service quality in real time, including but not limited to hardware information, software information, service information, and startup information. Probes pre-installed on set-top boxes monitor video service quality in real time, including but not limited to playback duration, stalling information, program information, and network delay. The cause of poor Internet TV service quality usually lies on the user side, and operation and maintenance personnel often find it hard to locate the problem directly during an on-site visit. Using set-top box probes alone also makes it hard to pinpoint the fault. Current solutions rely mainly on set-top box probes: the probe detects poor quality and makes a judgment from the data it collects itself. However, the usage of other devices in the home also affects the set-top box, so the analysis is inevitably incomplete. Uploading all set-top box data to a unified platform would place excessive storage and computing pressure on that platform. Another solution has the set-top box probe communicate with the gateway probe on the user side to obtain home network information in real time; its root cause diagnosis depends heavily on human experience and is prone to misjudgment. Moreover, the data is not retained and is gradually lost, so this approach contributes little to improving overall network quality.

There are many possible causes of poor user video quality: content source quality, core network quality, access network quality, and home network quality. Quality problems in the core network and access network are relatively easy to monitor; degradation can be identified quickly by monitoring key performance indicators. Problems on the user side are hard to attribute to a specific cause because of the large number of users, the complexity of usage scenarios, and the diversity of devices attached to home broadband, which makes root cause analysis difficult. Real-time methods do not accumulate knowledge and rely too heavily on empirical conclusions; problems for which no such conclusions exist cannot be solved effectively, and new challenges cannot be met. On-site inspection is constrained by time, location, usage scenario, and other factors, and is hard to carry out. Accordingly, this application proposes a method that processes historical data effectively, reduces data storage pressure, extracts knowledge effectively, and supports root cause analysis from multiple aspects.

Summary of the Invention

To address the defects or shortcomings of the prior art, the present invention provides a root cause analysis method for poor user video quality, an electronic device, and a storage medium.

The technical solution of the present invention is as follows:

A root cause analysis method for poor user video quality, characterized in that it comprises the following steps:

Step S201, preprocessing the reported data to form valid data, wherein the reported data includes the report-related data formed by collecting operation data of the user video terminal;

Step S202, constructing data sets using the complaint data and the valid data;

Step S203, performing association rule analysis on the data of the constructed data sets to form a knowledge base, wherein the knowledge base includes a data table with the following columns: support, confidence, lift, reason (root cause), and result.

The terminal operation data in step S201 includes set-top box probe data and gateway probe data and is uploaded periodically, the period being limited by a first threshold. In troubleshooting the root cause of poor Internet TV service quality, combining set-top box probe data with gateway probe data locates the fault point in the home local area network more accurately and improves troubleshooting efficiency.

The terminal operation data includes user complaint data and access network resource tree data, and is reported to the network server over TCP.

The data preprocessing includes data filtering, which removes records with no viewing behavior. The data set construction includes associating the complaint data with the terminal operation data, selecting the user data whose time difference satisfies a second threshold as the complaint data set, and taking the remaining data as the non-complaint data set; incomplete data is then filled using the AP (affinity propagation) clustering algorithm, and the successfully filled data is marked as the silent poor-quality data set. Three data sets are thus formed: the complaint data set, the silent poor-quality data set, and the non-complaint data set. The association rule analysis includes constructing a user perception index for the records in the complaint data set and the silent poor-quality data set, discretizing the data, performing association rule analysis on the discretized data, and outputting the association rule records as knowledge base data. The knowledge base also includes optimization suggestions for different root causes.

The method includes real-time diagnosis of the user video of a user terminal using the knowledge base: the user terminal actively reports set-top box probe data in real time, the real-time data is discretized and compared against the knowledge base, and root cause analysis results and optimization suggestions are returned; or, the user terminal is dial-tested remotely, a collection command is issued, real-time data is uploaded and compared against the knowledge base, and root cause analysis results and optimization suggestions are returned; or, the operator dial-tests the user terminal periodically to obtain its network quality status, which is compared against the knowledge base, and root cause analysis results and optimization suggestions are returned.

The data set construction in step S202 includes the following steps:

Step S2021, associating the resource tree, gateway probe data, set-top box probe data, and complaint data;

Step S2022, classifying the data to form a complaint data set and a non-complaint data set;

Step S2023, expanding the complaint data set to form a silent poor-quality data set;

Step S2024, constructing a user perception index using the complaint data set and the silent poor-quality data set;

Step S2025, discretizing the data to form an association analysis data set.
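The discretization in step S2025 can be sketched as a simple binning of numeric probe fields into categorical items that an association rule miner can consume. The source does not specify bin edges, field names, or labels, so `BINS`, `stall_count`, `tcp_retrans_rate`, and the label strings below are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels; the source does not specify them.
# A value v falls into labels[i] where i = number of edges <= v.
BINS = {
    "stall_count": ([1, 5], ["stall_none", "stall_few", "stall_many"]),
    "tcp_retrans_rate": ([0.01, 0.05], ["retrans_low", "retrans_mid", "retrans_high"]),
}


def discretize(record):
    """Map each known numeric field of a record to one categorical item label."""
    items = set()
    for field, (edges, labels) in BINS.items():
        if field in record:
            items.add(labels[bisect_right(edges, record[field])])
    return items


sample = {"stall_count": 7, "tcp_retrans_rate": 0.002}
print(sorted(discretize(sample)))
```

Each record becomes a set of items (a "transaction"), which is the usual input shape for association rule analysis.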

The complaint data set expansion in step S2023 is implemented using the AP (affinity propagation) clustering algorithm and includes the following steps:

Step S20231, initializing the algorithm, including initializing the matrices R, A, and S;

Step S20232, updating the responsibility (attraction) matrix R;

Step S20233, updating the availability (attribution) matrix A;

Step S20234, applying damping according to the damping coefficient;

Step S20235, determining whether the matrices are stable or the maximum number of iterations has been reached; if not, returning to step S20232; if so, proceeding to step S20236;

Step S20236, selecting the points with the largest A+R as cluster centers;

Step S20237, exporting the silent poor-quality data set;

Step S20238, updating the non-complaint data set.
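Steps S20231 to S20238 can be sketched in plain Python using the standard affinity propagation update rules. This is a minimal illustration under stated assumptions, not the patent's implementation: the similarity (negative squared Euclidean distance), the median preference, the default parameter values, and the rule in `silent_poor_quality` (a non-complaint record that shares a cluster with a complaint record is treated as silent poor quality) are all assumptions. Damping (S20234) is folded into the two update steps.

```python
def affinity_propagation(points, damping=0.5, max_iter=200, stable_iter=15):
    """Return, for each point, the index of its exemplar (needs >= 2 points)."""
    n = len(points)
    # S20231: S = negative squared Euclidean distance; R and A start at zero.
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    off_diag = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = off_diag[len(off_diag) // 2]  # preference: a median similarity
    R = [[0.0] * n for _ in range(n)]
    A = [[0.0] * n for _ in range(n)]
    previous, unchanged = None, 0
    for _ in range(max_iter):
        # S20232 + S20234: damped r(i,k) = s(i,k) - max_{j != k}(a(i,j) + s(i,j))
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # S20233 + S20234: damped availability update, column by column
        for k in range(n):
            positive = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(positive) - positive[k]  # excludes row k itself
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
        # S20235: stop once the exemplar set is non-empty and unchanged
        exemplars = frozenset(k for k in range(n) if A[k][k] + R[k][k] > 0)
        unchanged = unchanged + 1 if exemplars and exemplars == previous else 0
        previous = exemplars
        if unchanged >= stable_iter:
            break
    # S20236: each point selects the candidate k with the largest A + R
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


def silent_poor_quality(labels, is_complaint):
    """S20237 (assumed rule): non-complaint records clustered with a complaint."""
    complaint_clusters = {lab for lab, c in zip(labels, is_complaint) if c}
    return [i for i, (lab, c) in enumerate(zip(labels, is_complaint))
            if not c and lab in complaint_clusters]


points = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
labels = affinity_propagation(points)
print(silent_poor_quality(labels, [True, False, False, False, False, False]))
```

The records returned by `silent_poor_quality` would then be removed from the non-complaint data set, which corresponds to step S20238. The triple loop is cubic per iteration and is only meant for small illustrative inputs.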

An application network using the above root cause analysis method for poor user video quality, characterized in that it comprises a user terminal, a network server, a database, and a knowledge base, which jointly perform the following steps:

Step S401, data reporting;

Step S402, analyzing historical data;

Step S403, knowledge-base-assisted real-time diagnosis and optimization;

Step S401 includes the following steps:

Step S4011, the user terminal powers on;

Step S4012, the user terminal reports its startup data to the network server;

Step S4013, the network server stores the user terminal startup data in the database;

Step S4014, the user terminal periodically reports data to the network server;

Step S4015, the network server stores the periodically reported data in the database;

Step S402 includes the following steps:

Step S4021, the network server extracts historical data from the database;

Step S4022, the network server performs association rule analysis to form the knowledge base;

Step S4023, the network server updates and maintains the knowledge base;

Step S403 includes the following steps:

Step S4031, the user terminal reports real-time measurement data to the network server to assist diagnosis;

Step S4032, the network server extracts relevant knowledge from the knowledge base;

Step S4033, the network server computes the cause of the poor quality;

Step S4034, the network server sends the diagnosis result and optimization plan to the user terminal;

Step S4035, the user terminal feeds back the post-optimization effect to the network server;

Step S4036, the network server updates and maintains the knowledge base.

An electronic device, characterized in that it comprises: a memory, a processor, a receiver, a display, and a computer program, wherein the computer program is stored in the memory and the processor runs the computer program to execute the above root cause analysis method for poor user video quality.

A storage medium, characterized in that it comprises: a readable storage medium and a computer program stored in the readable storage medium, wherein the computer program is used to implement the above root cause analysis method for poor user video quality.

The technical effects of the present invention are as follows: with the root cause analysis method for poor user video quality, the electronic device, and the storage medium of the present invention, combining set-top box probe data with gateway probe data when troubleshooting the root cause of poor Internet TV service quality locates the fault point in the home local area network more accurately and improves troubleshooting efficiency. Historical data can be processed effectively, data storage pressure is reduced, knowledge is extracted effectively, and root cause analysis can be performed from multiple aspects.

The features of the present invention are as follows: 1. Offline analysis. This application analyzes historical data and derives a knowledge base that summarizes multiple root causes, enabling offline analysis. 2. Online real-time query. Set-top box probe data is uploaded in real time for real-time diagnosis, which is accurate, fast, and more efficient. 3. Multiple diagnosis modes can run at the same time. Diagnosis initiated by the user terminal, diagnosis initiated by installation and maintenance personnel, and periodic diagnosis by the operator are all supported. 4. Updating and maintenance of the knowledge base are supported. Several update and maintenance modes are available, and accuracy improves as data accumulates.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the data collection and reporting process involved in implementing the root cause analysis method for poor user video quality of the present invention. Fig. 1 includes step S101, collecting terminal operation data; and step S102, reporting the relevant data.

Fig. 2 is a schematic flow chart of the root cause analysis method for poor user video quality of the present invention. Fig. 2 includes step S201, preprocessing the reported data to form valid data; step S202, constructing data sets using the complaint data and the valid data; and step S203, performing association rule analysis to form a knowledge base (or knowledge database).

Fig. 3 is a schematic diagram of the data preprocessing process in step S201 of Fig. 2. Fig. 3 includes step S2011, filtering set-top box probe data; step S2012, associating set-top box probe data to form valid data; step S2013, filtering gateway probe data; and step S2014, associating gateway probe data to form valid data.

Fig. 4 is a schematic diagram of the data set construction process in step S202 of Fig. 2. Fig. 4 includes step S2021, associating the resource tree, gateway probe data, set-top box probe data, and complaint data; step S2022, classifying the data to form a complaint data set and a non-complaint data set; step S2023, expanding the complaint data set to form a silent poor-quality data set; step S2024, constructing a user perception index using the complaint data set and the silent poor-quality data set; and step S2025, discretizing the data to form an association analysis data set.

Fig. 5 is a schematic flow chart of the complaint data set expansion in step S2023 of Fig. 4, implemented with the AP clustering algorithm. Fig. 5 includes step S20231, initializing the algorithm, including initializing the matrices R, A, and S; step S20232, updating the responsibility (attraction) matrix R; step S20233, updating the availability (attribution) matrix A; step S20234, applying damping according to the damping coefficient; step S20235, determining whether the matrices are stable or the maximum number of iterations has been reached, and if not, returning to step S20232, or if so, proceeding to step S20236; step S20236, selecting the points with the largest A+R as cluster centers; step S20237, exporting the silent poor-quality data set; and step S20238, updating the non-complaint data set.

Fig. 6 is a schematic diagram of the real-time diagnosis process for the root cause of poor user video quality. Fig. 6 includes step S301, the user terminal issues a request; step S302, computing the user perception index; step S303, discretizing the reported data; step S304, querying the knowledge base; and step S305, returning the root cause together with relevant suggestions.

Fig. 7 is a schematic diagram of the overall workflow of a network-based application of the root cause analysis method for poor user video quality of the present invention. Fig. 7 includes a user terminal, a network server, a database, and a knowledge base, and includes step S401, data reporting; step S402, analyzing historical data; and step S403, knowledge-base-assisted real-time diagnosis and optimization. Step S401 includes step S4011, the user terminal powers on; step S4012, the user terminal reports its startup data to the network server; step S4013, the network server stores the user terminal startup data in the database; step S4014, the user terminal periodically reports data to the network server; and step S4015, the network server stores the periodically reported data in the database. Step S402 includes step S4021, the network server extracts historical data from the database; step S4022, the network server performs association rule analysis to form the knowledge base; and step S4023, the network server updates and maintains the knowledge base. Step S403 includes step S4031, the user terminal reports real-time measurement data to the network server to assist diagnosis; step S4032, the network server extracts relevant knowledge from the knowledge base; step S4033, the network server computes the cause of the poor quality; step S4034, the network server sends the diagnosis result and optimization plan to the user terminal; step S4035, the user terminal feeds back the post-optimization effect to the network server; and step S4036, the network server updates and maintains the knowledge base.

Fig. 8 is a schematic diagram of sample knowledge base data. In Fig. 8, column 1 is the row number within the table, column 2 is the support, column 3 is the confidence, column 4 is the lift, column 5 is the reason (root cause), and column 6 is the result.

Detailed Description

The present invention is described below with reference to the accompanying drawings (Fig. 1 to Fig. 8) and embodiments.

The present invention proposes a method and apparatus for root cause analysis of poor user video quality, a user terminal, and a network server. In troubleshooting the root cause of poor Internet TV service quality, combining set-top box probe data with gateway probe data locates the fault point in the home local area network more accurately and improves troubleshooting efficiency. To solve the above problems, embodiments of the present invention provide the following technical solutions:

A data feedback apparatus, located in a user terminal, which collects terminal operation data. The terminal operation data includes set-top box probe data and gateway probe data and is uploaded periodically, the period being limited by a first threshold.

The terminal operation data is reported to the server over TCP.

In particular, the reported data should also include user complaint data and access network resource tree data.
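The TCP reporting described above can be sketched as follows. The source states only that terminal operation data is reported over TCP; the JSON encoding, the 4-byte big-endian length prefix, the function names, and the `host`/`port` parameters are assumptions made for illustration.

```python
import json
import socket
import struct


def encode_report(report):
    """Serialize a probe report as length-prefixed JSON (assumed framing)."""
    payload = json.dumps(report, ensure_ascii=False).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload


def decode_report(frame):
    """Inverse of encode_report: read the length prefix, then parse the JSON body."""
    (length,) = struct.unpack("!I", frame[:4])
    return json.loads(frame[4:4 + length].decode("utf-8"))


def send_report(host, port, report, timeout=5.0):
    """Open a TCP connection to the (hypothetical) server and send one report."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_report(report))


frame = encode_report({"device": "stb", "stall_count": 3})
print(decode_report(frame))
```

A length prefix is one common way to delimit messages on a TCP byte stream; a periodic uploader would call `send_report` once per reporting period.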

A method for analyzing the root cause of poor user video quality, used in a network server, comprising: preprocessing the terminal operation data by filtering out records with no viewing behavior; associating the complaint data with the terminal operation data, selecting the user data whose time difference satisfies a second threshold as the complaint data set, and taking the remaining data as the non-complaint data set; and filling incomplete data using the AP (affinity propagation) clustering algorithm, marking the successfully filled data as the silent poor-quality data set. Three data sets are thus formed: the complaint data set, the silent poor-quality data set, and the non-complaint data set.
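The filtering and complaint association described above can be sketched as follows. The record fields (`user_id`, `time`, `watch_seconds`), the one-hour default for the second threshold, and the reading of "time difference satisfies the second threshold" as an absolute difference no larger than the threshold are assumptions; the source does not fix any of them.

```python
from datetime import datetime, timedelta


def build_datasets(records, complaints, second_threshold=timedelta(hours=1)):
    """Split probe records into (complaint_set, non_complaint_set)."""
    # Preprocessing: drop records with no viewing behavior.
    valid = [r for r in records if r["watch_seconds"] > 0]
    complaint_set, non_complaint_set = [], []
    for r in valid:
        # A record is complaint-related if the same user complained
        # within the threshold of the record's timestamp.
        matched = any(
            c["user_id"] == r["user_id"]
            and abs(c["time"] - r["time"]) <= second_threshold
            for c in complaints
        )
        (complaint_set if matched else non_complaint_set).append(r)
    return complaint_set, non_complaint_set


t0 = datetime(2022, 4, 1, 20, 0)
records = [
    {"user_id": "u1", "time": t0, "watch_seconds": 600},
    {"user_id": "u2", "time": t0, "watch_seconds": 300},
    {"user_id": "u3", "time": t0, "watch_seconds": 0},
]
complaints = [{"user_id": "u1", "time": t0 + timedelta(minutes=20)}]
complaint_set, non_complaint_set = build_datasets(records, complaints)
print(len(complaint_set), len(non_complaint_set))
```

The non-complaint set produced here is the input to the clustering step that extracts the silent poor-quality data set.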

A user perception index is constructed for the records in the complaint data set and the silent poor-quality data set, and the data is discretized. Association rule analysis is performed on the discretized data, and the resulting association rule records are output as the knowledge base.
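The three metrics stored in the knowledge base (support, confidence, lift) follow their standard definitions for a rule `reason -> result`. The sketch below computes them for one candidate rule over discretized records; it does not implement rule mining itself (the source does not name a mining algorithm), and the item labels are hypothetical.

```python
def rule_metrics(transactions, reason, result):
    """Compute support, confidence, and lift for the rule reason -> result."""
    reason, result = set(reason), set(result)
    n = len(transactions)
    n_reason = sum(1 for t in transactions if reason <= t)
    n_result = sum(1 for t in transactions if result <= t)
    n_both = sum(1 for t in transactions if (reason | result) <= t)
    support = n_both / n                                  # P(reason and result)
    confidence = n_both / n_reason if n_reason else 0.0   # P(result | reason)
    lift = confidence / (n_result / n) if n_result else 0.0
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "reason": sorted(reason),
        "result": sorted(result),
    }


# Hypothetical discretized records (one set of items per record).
transactions = [
    {"wifi_weak", "stall_many"},
    {"wifi_weak", "stall_many"},
    {"wifi_weak", "stall_none"},
    {"wifi_ok", "stall_none"},
]
print(rule_metrics(transactions, ["wifi_weak"], ["stall_many"]))
```

A lift above 1 indicates that the reason items make the result more likely than its base rate, which is why lift is a natural ranking key when the knowledge base is queried.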

In particular, the knowledge base should also include optimization suggestions for different root causes.

A real-time diagnosis method for a user terminal, comprising: the user terminal actively reports set-top box probe data in real time, the real-time data is discretized and compared against the knowledge base, and root cause analysis results and optimization suggestions are returned. Alternatively, the user terminal is dial-tested remotely, a collection command is issued, and real-time data is uploaded for root cause analysis. Alternatively, the operator dial-tests the user terminal periodically to obtain its network quality status.
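Comparing discretized real-time data against the knowledge base can be sketched as a rule lookup. The matching criterion (every `reason` item of a rule appears in the observed items), the `min_confidence` cut-off, the ranking by lift, and the `suggestion` field name are assumptions; the source states only that root causes and optimization suggestions are returned.

```python
def diagnose(items, knowledge_base, min_confidence=0.5):
    """Return (reason, result, suggestion) for matching rules, strongest first."""
    hits = [
        row for row in knowledge_base
        if set(row["reason"]) <= items and row["confidence"] >= min_confidence
    ]
    hits.sort(key=lambda row: (row["lift"], row["confidence"]), reverse=True)
    return [(row["reason"], row["result"], row.get("suggestion", "")) for row in hits]


# Hypothetical knowledge base rows with the columns named in the source,
# plus an assumed "suggestion" field.
knowledge_base = [
    {"support": 0.2, "confidence": 0.8, "lift": 2.0,
     "reason": ["wifi_weak"], "result": ["poor_quality"],
     "suggestion": "Use a wired connection or move closer to the gateway."},
    {"support": 0.1, "confidence": 0.9, "lift": 3.0,
     "reason": ["retrans_high", "wifi_weak"], "result": ["poor_quality"],
     "suggestion": "Check for interference on the Wi-Fi channel."},
    {"support": 0.1, "confidence": 0.4, "lift": 1.1,
     "reason": ["cpu_high"], "result": ["poor_quality"]},
]
observed = {"wifi_weak", "retrans_high", "cpu_high"}
for reason, result, suggestion in diagnose(observed, knowledge_base):
    print(reason, "->", result, "|", suggestion)
```

The same lookup would serve all three diagnosis modes; they differ only in how the real-time data reaches the server.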

Fig. 1 shows the process of Embodiment 1, with the following specific steps: S101: collecting terminal operation data. The terminal operation data includes set-top box probe data and gateway probe data.

Further, the set-top box probe data includes set-top box startup data, hardware data, software data, network data, service data, and operation data.

The set-top box startup data includes: the set-top box boot duration and the duration of the previous run.

The set-top box hardware data includes: the set-top box manufacturer and the licensee.

The set-top box software data includes: probe provider, probe version, system version information, set-top box firmware version, and CPU model.

The set-top box network data includes: network connection mode, m3u8 file request success rate, media file request success rate, EPG request success rate, average response delay of m3u8 file requests, average response delay of media file requests, average response delay of EPG requests, TCP connection success rate, average TCP connection setup time, and average TCP retransmission rate. If the network connection mode is Wi-Fi, it should also include the bandwidth of the wireless hotspot and the wireless signal strength. Here EPG stands for Electronic Program Guide, m3u8 is a playlist file format, and TCP is the Transmission Control Protocol.

The set-top box service data includes: the list of installed application package names, the list of watched programs, the list of accessed TCP resources, the list of failed m3u8 requests, the list of failed media file requests, the list of failed EPG requests, and the list of failed TCP requests.

The set-top box operation data includes: user viewing duration, number of stalls, total stall duration, continuous running time of the set-top box, whether the licensee's application is running, CPU usage, and memory usage.

Further, the gateway probe data includes gateway hardware data, software data, network data, service data, operation data, and attached-device data.

The gateway hardware data includes: CPU model, hardware version number, gateway model, received optical power, and transmitted optical power.

The gateway software data includes: software version number, soft probe middleware version number, and interface protocol version number.

The gateway network data includes: TCP connection success rate, TCP retransmission rate, HTTP response delay, HTTP download rate, and HTTP request success rate.

The gateway service data includes: periodic average of uplink traffic, periodic average of downlink traffic, uplink traffic, downlink traffic, uplink peak rate, downlink peak rate, protocol type, service type, number of surrounding WiFi networks, and WiFi channel.

The gateway operation data includes: gateway running time, CPU usage, and memory usage.

The gateway attached-device data includes: the number of attached devices, the list of attached-device MAC addresses, the list of periodic averages of uplink traffic per attached device, the list of periodic averages of downlink traffic per attached device, the uplink traffic list, and the downlink traffic list. The attached-device data is a set of lists whose length is determined by the number of attached devices.

In particular, the set-top box probe data should also include the set-top box account, the set-top box unique identifier, and the set-top box MAC address. The gateway probe data should also include the gateway account and the gateway unique identifier.

S102: Upload the data. The gateway probe data and the set-top box probe data are reported over TCP.

Further, the set-top box startup data, hardware data, and software data are uploaded once after startup; the set-top box network data, service data, and operation data are reported periodically.

Further, the gateway hardware data and software data are uploaded once after startup; the gateway network data, service data, operation data, and attached-device data are reported periodically.

Further, the required data also includes network alarm data and user complaint data.

Further, the collection period of the gateway data and that of the set-top box data should be the same.

Further, the collection period of the periodically reported data is the first threshold; in this embodiment the first threshold is set to 15 minutes.

Further, access network resource tree data should also be included.

FIG. 2 shows the flow of Embodiment 2. The specific steps are as follows. S201: Data preprocessing. The content of the reported data is as described in Embodiment 1. The processing of the reported data is shown in FIG. 3.

S2011: Filter the set-top box probe data. The filtering rules are: the user viewing duration is less than or equal to 0; the network connection mode is unknown. Records that match any one of the filtering rules are filtered out.
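The filtering step of S2011 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout and the field names `viewing_seconds` and `connection_mode` are hypothetical, and "filtered out" is read here as "discarded".

```python
def filter_stb_records(records):
    """Return the set-top box records that survive the S2011 rules.

    Assumption: each record is a dict with the hypothetical keys
    'viewing_seconds' and 'connection_mode'.
    """
    kept = []
    for rec in records:
        no_viewing = rec["viewing_seconds"] <= 0          # rule 1
        unknown_net = rec["connection_mode"] == "unknown"  # rule 2
        # A record is discarded as soon as it matches any one rule.
        if not (no_viewing or unknown_net):
            kept.append(rec)
    return kept
```

Discarding records with a non-positive viewing duration also keeps later ratios that divide by the viewing duration well defined.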

S2012: Associate the set-top box data. The set-top box startup data, hardware data, software data, network data, service data, and operation data are joined at the granularity of the first threshold to build a wide table.

After association, the set-top box probe data has the following format:

$Y_{STB} = \{y_1, y_2, \dots, y_{34}\}$

$Y_{STB}$ contains 34 fields, in order: set-top box manufacturer, licensee, probe provider, probe version, system version information, set-top box firmware version, CPU model, whether the licensee's application is running, set-top box boot duration, duration of the previous run, m3u8 file request success rate, media file request success rate, EPG request success rate, average m3u8 file request response delay, average media file request response delay, average EPG request response delay, TCP connection success rate, average TCP connection setup time, average TCP retransmission rate, user viewing duration, number of freezes, total freeze duration, continuous running time of the set-top box, CPU usage, memory usage, list of installed application package names, list of watched programs, list of accessed TCP resources, m3u8 request failure list, media file request failure list, EPG request failure list, TCP request failure list, network connection mode, and wireless signal strength.

S2013: Filter the gateway probe data. The filtering rules include: the attached-device data is empty; the record contains null values.

S2014: Associate the gateway probe data. The gateway hardware data, software data, network data, service data, operation data, and attached-device data are joined.

After association, the gateway probe data has the following format:

$X_{HGU} = \{x_1, x_2, \dots, x_{32}\}$

$X_{HGU}$ contains 32 fields, in order: CPU model, hardware version number, gateway model, software version number, soft probe middleware version number, interface protocol version number, protocol type, service type, WiFi channel, received optical power, transmitted optical power, TCP connection success rate, TCP retransmission rate, HTTP response delay, HTTP download rate, HTTP request success rate, periodic average of uplink traffic, periodic average of downlink traffic, uplink traffic, downlink traffic, uplink peak rate, downlink peak rate, gateway running time, CPU usage, memory usage, number of surrounding WiFi networks, number of attached devices, list of attached-device MAC addresses, list of periodic averages of uplink traffic per attached device, list of periodic averages of downlink traffic per attached device, uplink traffic list, and downlink traffic list.

S202: Dataset construction. The dataset construction flow is shown in FIG. 4.

S2021: Data association. The gateway probe data, set-top box probe data, and complaint data are joined. Because these three sources use different unique user identifiers, resource tree data is needed for the join. The resource tree data includes at least: the user's gateway unique identifier, set-top box unique identifier, and complaint account unique identifier. After association, every record L has the following format:

$L = \{X_{HGU}, Y_{STB}, Z_{TS}\}$

In particular, $X_{HGU}$ is as described in S2014 and $Y_{STB}$ is as described in S2012; $Z_{TS}$ covers three cases.

S2022: Data classification. Records that exhibit complaint behavior are extracted. Complaint behavior is defined as a complaint event whose time difference from a set-top box viewing event is less than the second threshold. In Embodiment 2 the second threshold is 120 hours. The second threshold accounts for the delay between a viewing event and the user's complaint. Records with complaint behavior form the complaint dataset; all other records form the non-complaint dataset.
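The classification rule of S2022 can be illustrated with a short sketch. The record layout, the `user_id` and `viewing_time` keys, and the function names are hypothetical. The sketch also assumes that the complaint follows the viewing event; the text only states that the time difference must be below the second threshold.

```python
from datetime import datetime, timedelta

SECOND_THRESHOLD = timedelta(hours=120)  # value given for Embodiment 2


def has_complaint_behavior(viewing_time, complaint_times,
                           threshold=SECOND_THRESHOLD):
    """True if some complaint falls within `threshold` after the viewing event.

    Assumption: only complaints at or after the viewing time count.
    """
    return any(timedelta(0) <= c - viewing_time < threshold
               for c in complaint_times)


def split_by_complaint(records, complaints_by_user):
    """Split joined records into (complaint set, non-complaint set)."""
    complaint_set, non_complaint_set = [], []
    for rec in records:
        times = complaints_by_user.get(rec["user_id"], [])
        if has_complaint_behavior(rec["viewing_time"], times):
            complaint_set.append(rec)
        else:
            non_complaint_set.append(rec)
    return complaint_set, non_complaint_set
```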

S2023: Complaint dataset expansion. The dataset is expanded with the AP (affinity propagation) clustering algorithm; the flow is shown in FIG. 5.

S20231: Algorithm initialization. The existing complaint dataset is used as the initial centroids, also called cluster centers. The responsibility matrix R and the availability matrix A are initialized to zero matrices. The similarity matrix S is initialized as follows.

The data described in Embodiment 1 is of several types. To compute similarity across different types, the fields are grouped by data type. All similarity calculations use normalized data:

(1) The set-top box probe data is divided into four types:

The first data type includes: set-top box manufacturer, licensee, probe provider, probe version, system version information, set-top box firmware version, CPU model, and whether the licensee's application is running.

The second data type includes: set-top box boot duration, duration of the previous run, m3u8 file request success rate, media file request success rate, EPG request success rate, average m3u8 file request response delay, average media file request response delay, average EPG request response delay, TCP connection success rate, average TCP connection setup time, average TCP retransmission rate, user viewing duration, number of freezes, total freeze duration, continuous running time of the set-top box, CPU usage, and memory usage.

The third data type includes: list of installed application package names, list of watched programs, list of accessed TCP resources, m3u8 request failure list, media file request failure list, EPG request failure list, and TCP request failure list.

The fourth data type includes: network connection mode and wireless signal strength.

(2) The gateway probe data is divided into four types:

The fifth data type includes: CPU model, hardware version number, gateway model, software version number, soft probe middleware version number, interface protocol version number, protocol type, service type, and WiFi channel.

The sixth data type includes: received optical power, transmitted optical power, TCP connection success rate, TCP retransmission rate, HTTP response delay, HTTP download rate, HTTP request success rate, periodic average of uplink traffic, periodic average of downlink traffic, uplink traffic, downlink traffic, uplink peak rate, downlink peak rate, gateway running time, CPU usage, and memory usage.

The seventh data type includes: the number of surrounding WiFi networks.

The eighth data type includes: the number of attached devices, the list of attached-device MAC addresses, the list of periodic averages of uplink traffic per attached device, the list of periodic averages of downlink traffic per attached device, the uplink traffic list, and the downlink traffic list.

(3) For any first data sample A and second data sample B in the dataset, the similarity for the first through eighth data types is calculated as follows.

The similarity of the first data type is calculated as follows:

where $m_y$ is the first weight; in this embodiment $m_y = \{0.2, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1\}$, and $f(a, b)$ is calculated as follows:

The similarity of the second data type is calculated as follows:

The similarity of the third data type is calculated as follows:

Every quantity of the third data type is a list. num() counts the items in a list, ∪ denotes the union of two lists, ∩ denotes their intersection, and $w$ is the second weight; in this embodiment $w = (0.08, 0.08, 0.08, 0.19, 0.19, 0.19, 0.19)$.

The similarity $Dis_4$ of the fourth data type is calculated as follows: if the two network connection modes are the same and neither is WiFi, $Dis_4 = 0$; if both are non-WiFi and they differ, $Dis_4 = 1$; if both are WiFi, $Dis_4$ is the absolute value of the difference between the two wireless signal strengths.
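The three cases for $Dis_4$ can be written as a small function. This is a sketch under stated assumptions: the mode strings and parameter names are hypothetical, the signal strengths are taken to be already normalized, and the mixed case (one sample on WiFi, the other not) is not defined in the text, so it is assigned 1 here by analogy with the rule for $Dis_7$.

```python
def dis4(mode_a, mode_b, signal_a=None, signal_b=None):
    """Fourth-data-type term for two samples.

    signal_a / signal_b are normalized wireless signal strengths and are
    only used when both samples are on WiFi.
    """
    a_wifi = mode_a == "wifi"
    b_wifi = mode_b == "wifi"
    if a_wifi and b_wifi:
        return abs(signal_a - signal_b)
    if a_wifi != b_wifi:
        # Assumption: the source does not cover this case.
        return 1.0
    # Both non-WiFi: 0 if the modes are identical, 1 otherwise.
    return 0.0 if mode_a == mode_b else 1.0
```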

The similarity of the fifth data type is calculated as follows:

where $m_x$ is the third weight; in this embodiment $m_x = \{0.05, 0.05, 0.1, 0.05, 0.05, 0.2, 0.2, 0.1, 0.2\}$, and $f(a, b)$ is calculated in the same way as for the first data type.

The similarity of the sixth data type is calculated as follows:

The similarity $Dis_7$ of the seventh data type is calculated as follows: if the network connection mode in the set-top box data is non-WiFi for both samples, $Dis_7 = 0$; if it is WiFi for both samples, $Dis_7 = |x_{Ai} - x_{Bi}|$ with $i = 26$; if one is WiFi and the other is non-WiFi, $Dis_7 = 1$.

The similarity of the eighth data type is calculated as follows:

The set-top box is matched by MAC address against the gateway's attached-device MAC address list; if it is the k-th entry of that list, the similarity is expressed as follows:

(4) The entries of the similarity matrix S are:

$S_{AB} = -(1 + Dis_1 + Dis_5) \times (1 + Dis_2 + Dis_4 + Dis_7 + Dis_8) \times (Dis_3 + Dis_6)$
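As a worked illustration of the formula for $S_{AB}$, the sketch below combines eight per-type terms into one matrix entry. The function name and the convention of passing the terms as a sequence ordered $Dis_1$ to $Dis_8$ are assumptions made for the example.

```python
def similarity_entry(dis):
    """Combine Dis1..Dis8 (in that order) into S_AB.

    The result is never positive; identical samples give 0 and larger
    per-type terms push the value further below 0.
    """
    d1, d2, d3, d4, d5, d6, d7, d8 = dis
    return -(1 + d1 + d5) * (1 + d2 + d4 + d7 + d8) * (d3 + d6)
```

For example, with $Dis_1 = Dis_5 = 0.5$, $Dis_2 = 0.2$, $Dis_8 = 0.1$, $Dis_3 = 0.1$, $Dis_6 = 0.3$ and the remaining terms 0, the entry is $-(2)(1.3)(0.4) = -1.04$.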

S20232: Update the responsibility matrix. The update formula is as follows:

S20233: Update the availability matrix. The update formula is as follows:
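For reference, the standard affinity propagation updates are given below. It is an assumption that S20232 and S20233 use these standard forms; $s(i,k)$ denotes an entry of the similarity matrix S, $r(i,k)$ an entry of the responsibility matrix R, and $a(i,k)$ an entry of the availability matrix A.

```latex
r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \bigl\{ a(i,k') + s(i,k') \bigr\}

a(i,k) \leftarrow \min\Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0,\, r(i',k)\} \Bigr\}, \quad i \neq k

a(k,k) \leftarrow \sum_{i' \neq k} \max\{0,\, r(i',k)\}
```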

S20234: Apply damping according to the damping coefficient. The damping coefficient λ is the first adjustable parameter; in this embodiment λ = 0.5. The damping formulas are as follows, where $R_{t+1}$ and $A_{t+1}$ on the right-hand side are the undamped values just computed in S20232 and S20233:

$R_{t+1}(i,k) = \lambda R_t(i,k) + (1-\lambda) R_{t+1}(i,k)$

$A_{t+1}(i,k) = \lambda A_t(i,k) + (1-\lambda) A_{t+1}(i,k)$
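The damping step applies the same convex combination to every matrix entry. A minimal sketch, assuming the matrices are stored as lists of rows and using the hypothetical function name `damp`:

```python
def damp(previous, updated, lam=0.5):
    """Blend the previous matrix with the freshly updated one.

    previous: matrix from iteration t
    updated:  undamped matrix computed for iteration t+1
    lam:      damping coefficient (0.5 in this embodiment)
    """
    return [
        [lam * p + (1.0 - lam) * u for p, u in zip(prev_row, upd_row)]
        for prev_row, upd_row in zip(previous, updated)
    ]
```

The same function serves for both R and A, since the two damping formulas differ only in the matrix they are applied to.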

S20235: Repeat steps S20232 to S20234 until the matrices are stable or the maximum number of iterations is reached; the algorithm then ends.

S20236: Select the points with the largest A + R as cluster centers.

S20237: Export the silent poor-quality dataset. The export condition is that the cluster contains at least one complaint user record; for every such cluster, the remaining non-complaint records are exported as the silent poor-quality dataset.
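The export rule of S20237 can be sketched as follows, assuming that clustering has already produced one cluster label per record. The function name and the two parallel input lists are hypothetical.

```python
from collections import defaultdict


def silent_poor_quality_indices(labels, is_complaint):
    """Return indices of non-complaint records that share a cluster
    with at least one complaint record.

    labels:       cluster label of each record
    is_complaint: True where the record belongs to the complaint dataset
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    silent = []
    for members in clusters.values():
        if any(is_complaint[i] for i in members):
            silent.extend(i for i in members if not is_complaint[i])
    return sorted(silent)
```

The returned indices are also the records to remove from the non-complaint dataset in the following step.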

S20238: Update the non-complaint dataset. Records marked as silent poor quality are deleted from the non-complaint dataset.

At this point there are three datasets: the complaint dataset, the non-complaint dataset, and the silent poor-quality dataset.

S2024: Construct the user experience index. To better describe the user's actual experience, a user experience index is constructed. After it is added, every record in the dataset has the following format:

$L_{new} = \{X_{HGU}, Y_{STB}, Z_{TS}, T_{UE}\}$

The user experience index includes: freeze ratio, freeze frequency, network score, resource request score, and resource request success rate.

The freeze ratio is the total freeze duration divided by the user viewing duration.

The freeze frequency is the number of freezes divided by the user viewing duration, with the viewing duration converted to hours.

The network score is the average of the average TCP connection setup time, the average TCP retransmission rate, and the TCP connection failure rate (1 minus the TCP connection success rate) in the set-top box probe data.

The resource request score is the average of the average m3u8 file request response delay, the average media file request response delay, and the average EPG request response delay in the set-top box probe data.

The resource request success rate is the average of the m3u8 file request success rate, the media file request success rate, and the EPG request success rate in the set-top box probe data.
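The five components of the user experience index follow directly from the definitions above. The sketch below uses hypothetical field names and assumes that delays, rates, and the connection setup time have already been normalized to comparable scales, that the viewing duration is given in seconds, and that it is positive (records with a non-positive duration are removed in S2011).

```python
def user_experience_index(stb):
    """Compute the five index components from one set-top box record."""
    viewing_s = stb["viewing_seconds"]
    return {
        "freeze_ratio": stb["freeze_seconds"] / viewing_s,
        # Viewing duration is converted to hours for the frequency.
        "freeze_frequency": stb["freeze_count"] / (viewing_s / 3600.0),
        "network_score": (
            stb["tcp_setup_time"]
            + stb["tcp_retrans_rate"]
            + (1.0 - stb["tcp_success_rate"])
        ) / 3.0,
        "resource_request_score": (
            stb["m3u8_delay"] + stb["media_delay"] + stb["epg_delay"]
        ) / 3.0,
        "resource_request_success_rate": (
            stb["m3u8_success"] + stb["media_success"] + stb["epg_success"]
        ) / 3.0,
    }
```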

S2025: Data discretization. Association rule analysis is performed on the complaint dataset and the silent poor-quality dataset. The data must be discretized before association rule analysis. Equal-width discretization is used: each continuous field is divided into 5 bins.
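Equal-width discretization into 5 bins can be sketched as below. The function name is hypothetical, and the handling of a constant column (all values fall into bin 0) is an assumption, since the text does not cover that case.

```python
def equal_width_bins(values, n_bins=5):
    """Map each value to a bin index in 0..n_bins-1 using equal-width bins
    spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column falls entirely into bin 0.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so it is clamped.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```

For the values 0, 1, 2, 3, 4, 5, 10 the bin width is 2, so the resulting bin indices are 0, 0, 1, 1, 2, 2, 4.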

S203: Association rule analysis. Several algorithms can be used for association analysis; this application uses the Apriori algorithm. The steps are as follows:

S2031: On the k-th scan of the transaction dataset, generate the frequent k-itemsets.

S2032: Remove the candidate itemsets that do not satisfy the condition. The condition is the minimum support count, that is, the number of times an itemset occurs, controlled by the third threshold. In this embodiment the third threshold is 10000.

S2033: Repeat S2031 until no larger frequent itemsets can be generated. Frequent itemset generation is then complete.
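Steps S2031 to S2033 correspond to the level-wise frequent itemset search of Apriori. A compact sketch follows; the function name is hypothetical, each transaction is taken to be a set of discretized field values, and an itemset is treated as frequent when its count is at least `min_count`.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Scan k: count every candidate k-itemset.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        # Drop candidates below the minimum support count.
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemset candidates ...
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # ... and keep only those whose (k-1)-subsets are all frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

The loop ends when a scan yields no frequent itemsets, which is the stopping condition of S2033.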

S2034: Traverse the itemset combinations within the frequent itemsets to generate the rule set.

S2035: Compute the metrics of the rule set: support Sup, confidence Cof, and lift Lift.

S2036: Filter the rules. Keep the rules whose support count SupNum is greater than the third threshold, whose confidence Cof is greater than the fourth threshold, whose lift Lift is greater than the fifth threshold, and which contain at least one field of the user experience index. The rules that satisfy these conditions are written to the knowledge database. In this embodiment the fourth threshold is 0.4 and the fifth threshold is 2.
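The metrics of S2035 and the threshold filter of S2036 can be sketched as follows, using the usual definitions of support, confidence, and lift for a rule "antecedent implies consequent". The function names and dictionary keys are hypothetical, the antecedent and consequent are assumed to occur at least once, and the check that a rule contains a user experience index field is left out.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support count, support, confidence and lift of one rule."""
    n = len(transactions)
    a, c = frozenset(antecedent), frozenset(consequent)
    count_a = sum(1 for t in transactions if a <= set(t))
    count_c = sum(1 for t in transactions if c <= set(t))
    count_ac = sum(1 for t in transactions if (a | c) <= set(t))
    cof = count_ac / count_a
    return {
        "sup_num": count_ac,
        "sup": count_ac / n,
        "cof": cof,
        "lift": cof / (count_c / n),
    }


def keep_rule(m, min_count=10000, min_cof=0.4, min_lift=2.0):
    """Apply the third, fourth and fifth thresholds of this embodiment."""
    return (m["sup_num"] > min_count
            and m["cof"] > min_cof
            and m["lift"] > min_lift)
```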

Each rule in the knowledge database has five parts: root cause of poor quality, user experience index, support, confidence, and lift.

Further, root causes in the knowledge database fall into four types: set-top box hardware problems, set-top box software problems, content source problems, and network problems.

Set-top box hardware problems mainly show up as sustained high CPU and memory usage, hardware aging, and the like.

Set-top box software problems mainly show up as poor quality that is common across a given software, firmware, system, or probe version.

Content source problems mainly show up when the user's network quality is excellent but a specific content source exhibits poor quality.

Network problems mainly refer to too many devices attached to the home gateway, insufficient bandwidth, and similar issues.

Further, the knowledge database should also include the operations and maintenance staff's solutions for each type of root cause.

Embodiment 3 shows how real-time diagnosis is performed. Its flow is shown in FIG. 6.

S301: The user terminal sends a request and reports the set-top box probe data for the most recent 1 minute.

S302: Compute the user experience index.

S303: Discretize the reported data according to the steps described in S2025.

S304: Query the knowledge base. The discretized reported fields are compared with the knowledge base, and the rules whose root cause fields and user experience index fields are highly similar to the reported data are selected. The matching rules are sorted by confidence.

Similarity is defined as the number of fields in a knowledge base rule that equal the corresponding reported fields, divided by the number of fields in that rule.
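The matching rule of S304 can be sketched as follows. The rule layout (`fields`, `confidence`, `root_cause`), the function names, and the cutoff `min_similarity` are hypothetical; the text defines the similarity ratio but does not give the value above which it counts as high.

```python
def rule_similarity(rule_fields, reported):
    """Share of the rule's fields whose value equals the reported value."""
    matches = sum(1 for key, value in rule_fields.items()
                  if reported.get(key) == value)
    return matches / len(rule_fields)


def query_knowledge_base(rules, reported, min_similarity=0.8):
    """Return the matching rules, highest confidence first.

    min_similarity is an assumed cutoff, not a value from the source.
    """
    hits = [r for r in rules
            if rule_similarity(r["fields"], reported) >= min_similarity]
    return sorted(hits, key=lambda r: r["confidence"], reverse=True)
```

The first element of the returned list would then supply the root cause and the associated suggestion that are sent back to the terminal.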

S305: Return the root cause together with the related suggestions.

In particular, real-time diagnosis can also be initiated from the network side: the user terminal is dial-tested and commanded to report real-time data for diagnosis. This application does not restrict who initiates a real-time diagnosis request.

In particular, the operator can run real-time diagnosis on the network periodically.

Embodiment 4. FIG. 7 is a flowchart of Embodiment 4 of the present invention, which is the complete workflow of this application. It combines Embodiments 1, 2, and 3 to show the overall workflow of the invention. Embodiment 4 is only one preferred embodiment of the present invention.

Embodiment 4 involves four entities: the user terminal, the network server, the database, and the knowledge base. The user terminal is as described in Embodiment 1. The network server is as described in Embodiment 2. The knowledge base is as described in Embodiment 2. The database is not a key part of the present invention and is not restricted here.

The specific steps of Embodiment 4 are as follows:

S401: Data reporting. Embodiment 4 is an implementation of the overall process; the calculation method is as described in Embodiment 1.

S4011: Power on. The user actively turns on the user terminal.

S4012: Report the user terminal's startup data. The startup data is collected by the method described in Embodiment 1, and its content is as described in Embodiment 1.

S4013: Data storage. The startup data is stored.

S4014: Report data periodically. The periodically reported data includes set-top box probe data and gateway probe data.

S4015: Data storage. The periodically reported data is stored.

S402: Analyze historical data. The historical data includes set-top box probe data and gateway probe data. In particular, it should also include broadband user complaint data, which should contain the device number, the complaint reason category, and similar information.

S4021: Extract historical data. The required data is extracted from the database; the required data range is as described in Embodiment 2.

S4022: Compute the causes of poor quality and form the knowledge base. The knowledge base is generated as described in Embodiment 2.

S4023: Update and maintain the knowledge base. Updating and maintenance include adding, deleting, and modifying entries. The specific process is as described in Embodiment 2. FIG. 8 shows sample knowledge base data.

S403: Knowledge-base-assisted real-time diagnosis and optimization. "Assisted real-time diagnosis and optimization" refers specifically to the following scenario: installation and maintenance staff visit the user's premises to examine problems with the user terminal, run real-time tests, assess the network condition, and provide a corresponding solution. It is an application of the knowledge base described in Embodiment 3. "Diagnosis" means determining the cause of the user's poor quality. "Optimization" means improving the user experience by modifying specific parameters.

S4031: Report real-time measurement data to assist diagnosis. This step is carried out by the installation and maintenance staff. After the real-time measurement is taken, the measurement data is uploaded to the network server. The real-time measurement data includes set-top box probe data, whose range is as described in Embodiment 1.

S4032: Extract relevant knowledge from the knowledge base. The relevant information is retrieved from the knowledge base for diagnosis.

S4033: Compute the cause of poor quality. The method is as described in Embodiment 3.

S4034: Deliver the diagnosis result and the optimization plan. The result computed in S4033 is delivered to achieve network optimization.

S4035: Feed back the effect after optimization. The installation and maintenance staff report the optimization effect. This step is not the core of this application and is not restricted by it.

S4036: Update and maintain the knowledge base. Updating and maintaining the knowledge base with external knowledge can effectively improve its accuracy. This step is not the core of this application and is not restricted by it.

Embodiment 4 is one implementation of this application, given to aid understanding. The content of Embodiment 4 is not within the scope of protection of this application. The contents of Embodiments 1, 2, and 3 are preferred embodiments of this application.

一种用户视频质差根因分析方法,包括:A root cause analysis method for poor quality of user videos, comprising:

通过对历史数据的分析建立知识库;Build a knowledge base through analysis of historical data;

通过多种方式上报实时数据、通过与知识库对比分析质差根因,并下发根因和优化方案。Report real-time data through various means, analyze the root causes of poor quality by comparing with the knowledge base, and issue root causes and optimization plans.

使用历史数据,通过关联规则分析形成知识库。Use historical data to form a knowledge base through association rule analysis.

所述历史数据,包括机顶盒探针数据和网关探针数据。The historical data includes set-top box probe data and gateway probe data.

对机顶盒探针数据与网关探针数据进行关联,并于投诉数据进行关联,将关联后的数据分为投诉数据集和非投诉数据集。The set-top box probe data is associated with the gateway probe data and the complaint data, and the associated data is divided into a complaint data set and a non-complaint data set.

A clustering method is used to construct the silent poor-quality data set.

Complaint data is used as the initial cluster centers of the clustering algorithm.

In constructing the silent poor-quality data set, different similarity calculations are used for the first to eighth data types.

The rule for determining the silent poor-quality data set is that the cluster contains data from at least one complaining user.
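The cluster rule above can be sketched as a small labeling step applied after clustering. This is an illustration under assumptions: the clustering itself is taken as given, and the function name and the two parallel input lists are hypothetical.

```python
def label_silent_poor_quality(cluster_ids, is_complaint):
    """Flag non-complaint records that share a cluster with a complaint record.

    cluster_ids:  cluster label of each record (any hashable value)
    is_complaint: True where the record belongs to the complaint data set
    Returns a list of booleans, True for records treated as silent poor quality.
    """
    # Clusters that contain at least one complaint record.
    clusters_with_complaint = {
        cid for cid, flag in zip(cluster_ids, is_complaint) if flag
    }
    return [
        (not flag) and (cid in clusters_with_complaint)
        for cid, flag in zip(cluster_ids, is_complaint)
    ]
```

Complaint records themselves are not flagged, since they already belong to the complaint data set.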

The user perception index is calculated for the data in the complaint data set and the silent poor-quality data set.

The user perception index includes: freeze ratio, freeze frequency, network score, resource request score, and resource request success rate;

the freeze ratio is the total freeze duration divided by the user's viewing duration;

the freeze frequency is the number of freezes divided by the user's viewing duration, with the viewing duration converted to hours;

the network score is the average of the following set-top box probe values: average TCP connection establishment time, average TCP retransmission rate, and TCP connection failure rate (1 minus the TCP connection success rate);

the resource request score is the average of the following set-top box probe values: average m3u8 file request response delay, average media file request response delay, and average EPG request response delay;

the resource request success rate is the average of the following set-top box probe values: m3u8 file request success rate, media file request success rate, and EPG request success rate.
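The five components can be sketched as one function. This is a minimal sketch under assumptions: the dictionary keys are hypothetical, rates are fractions between 0 and 1, and the durations and delays that are averaged together with rates are assumed to be pre-normalized to a comparable scale, because the text does not state how the units are reconciled.

```python
def user_perception_index(m):
    """Compute the five index components from one record (a dict).

    All key names are hypothetical. Rates are fractions in [0, 1]; delays and
    the TCP setup time are assumed to be pre-normalized to a comparable scale.
    """
    watch_s = m["watch_seconds"]
    if watch_s <= 0:
        # Records without viewing behavior are expected to be filtered out earlier.
        raise ValueError("record has no viewing time")

    def mean(*values):
        return sum(values) / len(values)

    return {
        # Total freeze duration divided by viewing duration.
        "freeze_ratio": m["freeze_seconds"] / watch_s,
        # Number of freezes per hour of viewing.
        "freeze_frequency": m["freeze_count"] / (watch_s / 3600.0),
        # Failure rate is 1 minus the TCP connection success rate.
        "network_score": mean(
            m["tcp_setup_avg"], m["tcp_retrans_rate"], 1.0 - m["tcp_success_rate"]
        ),
        "resource_request_score": mean(
            m["m3u8_delay"], m["media_delay"], m["epg_delay"]
        ),
        "resource_request_success_rate": mean(
            m["m3u8_success"], m["media_success"], m["epg_success"]
        ),
    }
```

Under this reading, higher values of the first four components indicate a worse experience, while a higher resource request success rate indicates a better one.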

The real-time diagnosis method is implemented by comparison with the knowledge base.

A knowledge base rule has five parts: root cause of poor quality, user perception index, support, confidence, and lift.
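Support, confidence, and lift are the standard association rule measures. The sketch below computes them for a single rule over a list of transactions, where each transaction is a set of discretized items. The function name, the item strings, and the choice of which side of the rule holds the root cause are assumptions made for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of items (for example, discretized index levels)
    antecedent, consequent: sets of items
    """
    n = len(transactions)
    if n == 0:
        raise ValueError("no transactions")
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_both / n                        # P(A and C)
    confidence = n_both / n_a if n_a else 0.0   # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0  # P(C | A) / P(C)
    return support, confidence, lift
```

A lift above 1 means the antecedent makes the consequent more likely than its base rate, which is the usual reason for keeping a rule in the knowledge base.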

The comparison with the knowledge base relies on the similarity computed over the root-cause and user perception index fields of the knowledge base rules;

the similarity is defined as the number of fields in a knowledge base rule that are identical to the corresponding fields of the reported data, divided by the number of fields in the knowledge base rule.

The root cause of poor user video quality is determined by the rules with high similarity.

During comparison, the user perception index in each rule is matched first, then the similarity is calculated, and the matching rules are sorted by their confidence to give the suspected root causes of poor quality.
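The matching procedure can be sketched as follows. The rule layout (`fields`, `confidence`, `root_cause`), the function names, and the `min_similarity` cutoff are hypothetical; the text only says that rules with high similarity are used, without giving a threshold.

```python
def similarity(rule_fields, reported):
    """Share of rule fields whose value equals the reported value."""
    if not rule_fields:
        return 0.0
    matched = sum(1 for k, v in rule_fields.items() if reported.get(k) == v)
    return matched / len(rule_fields)


def rank_root_causes(rules, reported, min_similarity=0.5):
    """Return suspected root causes, highest rule confidence first.

    rules: list of dicts with 'fields', 'confidence', 'root_cause' (assumed layout)
    reported: dict of discretized values from the real-time report
    min_similarity: assumed cutoff for "high similarity"
    """
    candidates = []
    for rule in rules:
        s = similarity(rule["fields"], reported)
        if s >= min_similarity:
            candidates.append((rule["confidence"], s, rule["root_cause"]))
    # Sort by confidence, using similarity only to break ties.
    candidates.sort(key=lambda c: (c[0], c[1]), reverse=True)
    return [root_cause for _, _, root_cause in candidates]
```

Because the denominator is the number of fields in the rule rather than in the report, a rule with few fields can reach a similarity of 1.0 even when the report carries many more fields.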

The root causes of poor quality are divided into four categories: set-top box hardware problems, set-top box software problems, content source problems, and network problems.

The knowledge base should also include the solutions that operations and maintenance personnel apply to each category of root cause.

An electronic device, comprising: a memory, a processor, a receiver, a display, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to execute the root cause analysis method for poor user video quality according to any one of the above.

A storage medium, comprising: a readable storage medium and a computer program stored in the readable storage medium, wherein the computer program is used to implement the root cause analysis method for poor user video quality according to any one of the above.

Content not described in detail in this specification belongs to the prior art known to those skilled in the art. The above description is intended to help those skilled in the art understand the invention and does not limit its scope of protection. Any implementation that makes equivalent replacements, modifications, improvements, and/or simplifications to the above description without departing from the substance of the invention falls within the scope of protection of the invention.

Claims (8)

1. A root cause analysis method for poor user video quality, characterized by comprising the following steps:
Step S201: preprocess the reported data to form valid data, the reported data including report-related data formed by collecting operation data of the user video terminal;
Step S202: construct data sets using the complaint data and the valid data;
Step S203: perform association rule analysis on the data in the constructed data sets to form a knowledge base, the knowledge base including a data table with the following column names: support, confidence, lift, reason (root cause), and result;
wherein the terminal operation data in step S201 includes set-top box probe data and gateway probe data and is uploaded periodically, the period being limited by a first threshold; in troubleshooting the root cause of poor quality in Internet TV services, combining the set-top box probe data with the gateway probe data locates fault points within the home local area network more accurately and improves troubleshooting efficiency;
the data preprocessing includes data filtering, and the data filtering includes filtering out data with no viewing behavior;
the data set construction includes associating the complaint data with the terminal operation data, selecting the user data whose time difference satisfies a second threshold as the complaint data set, and treating the remaining data as the non-complaint data set; incomplete data is filled using the AP clustering algorithm, and the successfully filled data is marked as the silent poor-quality data set, so that three data sets are formed in total: the complaint data set, the silent poor-quality data set, and the non-complaint data set;
the association rule analysis includes constructing a user perception index for the records in the complaint data set and the silent poor-quality data set, discretizing it, performing association rule analysis on the discretized data, and outputting the association rule records as knowledge base data; the knowledge base further includes optimization suggestions for different root causes.

2. The root cause analysis method for poor user video quality according to claim 1, characterized in that the terminal operation data includes user complaint data and access network resource tree data, and the terminal operation data is reported to the network server over TCP.

3. The root cause analysis method for poor user video quality according to claim 1, characterized by including a method of diagnosing the user video of a user terminal in real time using the knowledge base:
the user terminal actively reports set-top box probe data in real time, the real-time data is discretized and compared with the knowledge base, and the root cause analysis result and optimization suggestions are returned;
or, the user terminal is remotely dial-tested, a collection command is issued, real-time data is uploaded and compared with the knowledge base, and the root cause analysis result and optimization suggestions are returned;
or, the operator periodically dial-tests the user terminal to obtain its network quality status, which is compared with the knowledge base, and the root cause analysis result and optimization suggestions are returned.

4. The root cause analysis method for poor user video quality according to claim 1, characterized in that the data set construction in step S202 comprises the following steps:
Step S2021: associate the resource tree, gateway probe data, set-top box probe data, and complaint data;
Step S2022: classify the data to form the complaint data set and the non-complaint data set;
Step S2023: expand the complaint data set to form the silent poor-quality data set;
Step S2024: construct the user perception index using the complaint data set and the silent poor-quality data set;
Step S2025: discretize the data to form the association analysis data set.

5. The root cause analysis method for poor user video quality according to claim 4, characterized in that the expansion of the complaint data set in step S2023 is implemented with the AP clustering algorithm and comprises the following steps:
Step S20231: initialize the algorithm, including initializing the matrices R, A, and S;
Step S20232: update the responsibility matrix R;
Step S20233: update the availability matrix A;
Step S20234: apply damping according to the damping coefficient;
Step S20235: determine whether the matrices are stable or the maximum number of iterations has been reached; if not, return to step S20232; if so, proceed to step S20236;
Step S20236: select the points with the largest A+R as cluster centers;
Step S20237: export the silent poor-quality data set;
Step S20238: update the non-complaint data set.

6. An application network using the root cause analysis method for poor user video quality according to any one of claims 1 to 5, characterized by comprising a user terminal, a network server, a database, and a knowledge base, which jointly perform the following steps:
Step S401: data reporting;
Step S402: analysis of historical data;
Step S403: knowledge-base-assisted real-time diagnosis and optimization;
step S401 includes the following steps:
Step S4011: the user terminal is powered on;
Step S4012: the user terminal reports its startup data to the network server;
Step S4013: the network server stores the user terminal startup data in the database;
Step S4014: the user terminal periodically reports data to the network server;
Step S4015: the network server stores the periodically reported data in the database;
step S402 includes the following steps:
Step S4021: the network server extracts historical data from the database;
Step S4022: the network server performs association rule analysis to form the knowledge base;
Step S4023: the network server updates and maintains the knowledge base;
step S403 includes the following steps:
Step S4031: the user terminal reports real-time measurement data to the network server to assist diagnosis;
Step S4032: the network server extracts the relevant knowledge from the knowledge base;
Step S4033: the network server calculates the cause of the poor quality;
Step S4034: the network server sends the diagnosis result and optimization plan to the user terminal;
Step S4035: the user terminal feeds the post-optimization effect back to the network server;
Step S4036: the network server updates and maintains the knowledge base.

7. An electronic device, characterized by comprising: a memory, a processor, a receiver, a display, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to execute the root cause analysis method for poor user video quality according to any one of claims 1 to 5.

8. A storage medium, characterized by comprising: a readable storage medium and a computer program stored in the readable storage medium, wherein the computer program is used to implement the root cause analysis method for poor user video quality according to any one of claims 1 to 5.
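The iteration listed in claim 5 (initialize R, A, and S; update R; update A; damp; check stability; pick the points with the largest A+R) matches the usual affinity propagation procedure, so it can be sketched in plain Python as below. This is an illustrative reading, not the patented implementation: the function name, the default `damping`, `max_iter`, and `stable_iters` values are assumptions, the preference values on the diagonal of `S` are left to the caller, and the use of complaint data as initial cluster centers is not modeled.

```python
def affinity_propagation(S, damping=0.5, max_iter=200, stable_iters=15):
    """Cluster n >= 2 points from an n x n similarity matrix S.

    S[i][k] is the similarity of point i to candidate exemplar k, and
    S[k][k] is the preference of k to be an exemplar.
    Returns, for each point, the index of the exemplar it selects.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibility matrix
    A = [[0.0] * n for _ in range(n)]  # availability matrix
    last_labels, stable = None, 0

    for _ in range(max_iter):
        # Update R: how well suited k is as exemplar for i versus other candidates.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                new = S[i][k] - best_other
                R[i][k] = damping * R[i][k] + (1.0 - damping) * new

        # Update A: accumulated evidence from other points that k is an exemplar.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1.0 - damping) * new

        # Each point selects the candidate with the largest A + R.
        labels = [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]
        stable = stable + 1 if labels == last_labels else 0
        last_labels = labels
        if stable >= stable_iters:
            break
    return last_labels
```

Here the damping step is folded into each update instead of being run as a separate pass, and stability is judged on the selected exemplars rather than on the raw matrices; both are simplifications of the step order in the claim.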
CN202210372229.2A 2022-04-11 2022-04-11 User video quality difference root cause analysis method, electronic equipment and storage medium Active CN114997560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210372229.2A CN114997560B (en) 2022-04-11 2022-04-11 User video quality difference root cause analysis method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114997560A CN114997560A (en) 2022-09-02
CN114997560B true CN114997560B (en) 2024-11-01

Family

ID=83024156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210372229.2A Active CN114997560B (en) 2022-04-11 2022-04-11 User video quality difference root cause analysis method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114997560B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116614398A (en) * 2023-06-14 2023-08-18 深圳市友华通信技术有限公司 Method, device, equipment and storage medium for automatically filling task parameters

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112702224A (en) * 2020-12-10 2021-04-23 北京直真科技股份有限公司 Method and device for analyzing quality difference of home broadband user
CN113971425A (en) * 2020-07-22 2022-01-25 中移(苏州)软件技术有限公司 An abnormality analysis method, device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7536370B2 (en) * 2004-06-24 2009-05-19 Sun Microsystems, Inc. Inferential diagnosing engines for grid-based computing systems
US8938749B2 (en) * 2010-08-31 2015-01-20 At&T Intellectual Property I, L.P. System and method to troubleshoot a set top box device
CN108768702A (en) * 2018-05-15 2018-11-06 华为技术有限公司 Network analysis method and equipment
CN111404762B (en) * 2019-01-02 2022-09-16 中国移动通信有限公司研究院 User video quality poor positioning method and device
CN110955575A (en) * 2019-11-14 2020-04-03 国网浙江省电力有限公司信息通信分公司 A business system fault location method based on correlation analysis model
US11630718B2 (en) * 2020-05-14 2023-04-18 At&T Intellectual Property I, L.P. Using user equipment data clusters and spatial temporal graphs of abnormalities for root cause analysis

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant