CN111310044A - Method, apparatus, device and storage medium for extracting page element information
- Publication number
- CN111310044A (application CN202010093390.7A)
- Authority
- CN
- China
- Prior art keywords
- element information
- page
- candidate
- extracting
- triggered
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The present application discloses a method, apparatus, device, and storage medium for extracting page element information, and relates to the technical field of data mining. The specific implementation is: collecting the set of element information triggered by user interaction events on a target page; and extracting target element information from that set according to the attributes of each element that are associated with the page's business and a metric of the number of times the element is triggered. Embodiments of the present application extract business-oriented element information that clearly reflects the semantics of user behavior, thereby increasing the value of element information for product and operations optimization.
Description
Technical field
The present application relates to computer technology, and in particular to the technical field of data mining.
Background
Event tracking (instrumentation) is a common data analysis method in website analytics. Program code is added to a website or application to collect data on how users browse, access, and use it, so that user interaction behavior can be analyzed and the product and operations can subsequently be optimized.
In the prior art, user interaction behavior is generally analyzed by collecting the element information triggered by that behavior. As the business of a website or application grows richer, more and more element information must be collected, the user behavior it reflects becomes cluttered, and its value for product and operations optimization declines.
Summary of the invention
Embodiments of the present application provide a method, apparatus, device, and storage medium for extracting page element information, so as to extract element information that is valuable for product and operations optimization.
In a first aspect, an embodiment of the present application provides a method for extracting page element information, including:
collecting a set of element information triggered by user interaction events on a target page;
extracting target element information from the element information set according to the attributes of the elements that are associated with the page's business and a metric of the number of times each element is triggered.
In this embodiment, collecting the set of element information triggered by user interaction events on the target page captures all of the user behavior data. Extracting target element information from that set according to the business-associated attributes of the elements and their trigger-count metric yields business-oriented element information that clearly reflects the semantics of user behavior, which increases the value of element information for product and operations optimization.
Optionally, the attributes of an element that are associated with the page's business include at least one of: a click-type event bound to the element, the element being of button type, the element having a link function, the element's content length being less than a set value, the element being located on a landing page, and the element being located on a promotion page;
the metric of the number of times an element is triggered includes the element's conversion rate and/or click-through rate.
In this optional implementation, a bound click-type event, a button type, a link function, a content length below the set value, and a location on a landing page or promotion page all clearly express an element's business attributes, so target element information extracted on the basis of these attributes is oriented to the page's business. An element's conversion rate and/or click-through rate captures how users focus on the element and reflects clear behavioral semantics, which helps extract target element information precisely.
Optionally, extracting target element information from the element information set according to the business-associated attributes of the elements and their trigger-count metric includes:
extracting a candidate element information set from the element information set according to the attributes of the elements that are associated with the page's business;
extracting the target element information from the candidate element information set according to the metric of the number of times each element is triggered.
In this optional implementation, attributes can be read directly from the element information set, whereas the trigger-count metric must be computed. Filtering element information by attribute first, and then performing a second extraction by the trigger-count metric, extracts the target element information quickly and reduces the amount of computation.
Optionally, extracting the candidate element information set from the element information set according to the business-associated attributes of the elements includes:
extracting, from the element information set, the element information triggered by click-type events;
extracting, from the element information triggered by click-type events, at least one of button-type element information, link-type element information, element information whose content length is less than a set value, element information located on a landing page, and element information located on a promotion page, to form the candidate element information set.
In this optional implementation, click-type events reflect the user's focus on an element better than other interaction events do. Selecting element information by click-type event first retains the widest range of usable element information compared with selecting by any other attribute. Candidate elements are then extracted according to whether the element is a button or a link, its content length, and the page on which it is located, which improves the efficiency of extraction.
Optionally, extracting the target element information from the candidate element information set according to the metric of the number of times each element is triggered includes:
calculating the conversion rate and/or click-through rate of each candidate element;
determining, as the target element information, the candidate element information in the candidate element information set whose conversion rate is within a first preset range and/or whose click-through rate is within a second preset range.
In this optional implementation, the conversion rate and click-through rate clearly reflect how users focus on an element and carry clear behavioral semantics, while their numerical values reflect the degree of that focus and therefore the strength of the behavioral semantics. On this basis, conversion rates and/or click-through rates within preset ranges can be chosen according to business needs, so that target element information of different semantic strengths is extracted.
Optionally, calculating the conversion rate and/or click-through rate of each candidate element includes:
obtaining multiple sessions generated while users interact with the target page, each session including the element information triggered by click-type events within a preset period;
calculating the conversion rate of each candidate element from the number of sessions that contain that candidate element's information and the total number of sessions; and/or calculating the click-through rate of each candidate element from the number of sessions that contain that candidate element's information and the number of times that candidate element's information is displayed on the target page.
In this optional implementation, the conversion rate and click-through rate are calculated at session granularity, based on whether the user clicked the candidate element within the preset period. This avoids the error introduced when a user clicks a candidate element repeatedly within a short time, and improves the accuracy of both rates.
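The two session-based ratios described above can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the source: S is the set of sessions, e is a candidate element, and N_shown(e) is the number of times e is displayed on the target page.

```latex
\mathrm{CVR}(e) = \frac{\lvert \{\, s \in S : e \in s \,\} \rvert}{\lvert S \rvert},
\qquad
\mathrm{CTR}(e) = \frac{\lvert \{\, s \in S : e \in s \,\} \rvert}{N_{\mathrm{shown}}(e)}
```

Because the numerator counts sessions rather than clicks, several clicks on e within one session contribute only once.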
Optionally, before determining, as the target element information, the element information in the candidate element information set whose conversion rate is within the first preset range and/or whose click-through rate is within the second preset range, the method further includes: deleting, from the candidate element information set, element information whose conversion rate exceeds a first preset threshold and/or element information whose click-through rate exceeds a second preset threshold.
In this optional implementation, elements with a high trigger-count metric are generally routinely reached elements that serve the page's basic functions, such as a login button or a save button. Such element information hardly reflects the semantics of a user's personalized behavior. Deleting candidate element information with a high trigger-count metric concentrates the extraction on element information that does reflect it.
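A minimal sketch of this pre-filtering step, assuming each candidate's metrics are already available in a dictionary. The function name, the `"conversion"` and `"ctr"` keys, and the threshold values used in the example are hypothetical; the source specifies only that candidates above a preset threshold are deleted.

```python
def drop_routine_elements(metrics, conversion_threshold=None, ctr_threshold=None):
    """Remove candidates whose conversion rate or CTR exceeds a threshold.

    metrics maps a (hypothetical) element id to a dict with the keys
    "conversion" and "ctr". A threshold left as None is not applied.
    """
    kept = {}
    for element_id, m in metrics.items():
        if conversion_threshold is not None and m["conversion"] > conversion_threshold:
            continue  # routinely reached element, e.g. a login button
        if ctr_threshold is not None and m["ctr"] > ctr_threshold:
            continue
        kept[element_id] = m
    return kept


example = {
    "login": {"conversion": 0.95, "ctr": 0.70},
    "read": {"conversion": 0.30, "ctr": 0.10},
}
remaining = drop_routine_elements(example, conversion_threshold=0.9)
```

In this example only `"read"` remains, since the login button's conversion rate is above the assumed threshold of 0.9.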
Optionally, after extracting the target element information from the element information set according to the business-associated attributes of the elements and their trigger-count metric, the method further includes:
setting tracking points on a page to be tracked according to the target element information.
In this optional implementation, tracking points are set on the page to be tracked according to the target element information, so that business-oriented element information that clearly reflects user behavior semantics can be accurately extracted from that page. This tracking method suits novice users and users who are not professional analysts, and has a low technical threshold. Starting from the element information set, it bridges the gap between the element information set and the tracking points a user configures, and produces tracking-point recommendations that carry business meaning.
In a second aspect, an embodiment of the present application further provides an apparatus for extracting page element information, including:
a collection module, configured to collect a set of element information triggered by user interaction events on a target page;
an extraction module, configured to extract target element information from the element information set according to the attributes of the elements that are associated with the page's business and a metric of the number of times each element is triggered.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method for extracting page element information provided by the embodiments of the first aspect.
In a fourth aspect, an embodiment of the present application further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the method for extracting page element information provided by the embodiments of the first aspect.
Other effects of the above optional implementations are described below with reference to specific embodiments.
Description of drawings
The accompanying drawings are provided for a better understanding of the solution and do not limit the present application. In the drawings:
FIG. 1 is a flowchart of a method for extracting page element information in Embodiment 1 of the present application;
FIG. 2 is a flowchart of a method for extracting page element information in Embodiment 2 of the present application;
FIG. 3 is a structural diagram of an apparatus for extracting page element information in Embodiment 3 of the present application;
FIG. 4 is a block diagram of an electronic device used to implement the method for extracting page element information according to an embodiment of the present application.
Detailed description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present application. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.
Embodiment 1
FIG. 1 is a flowchart of a method for extracting page element information in Embodiment 1 of the present application. This embodiment applies to the case of extracting, from the page element information triggered by user interaction behavior, the element information that is valuable for product and operations optimization. The method is performed by an apparatus for extracting page element information, which is implemented in software and/or hardware and is configured in an electronic device with a certain data processing capability.
As shown in FIG. 1, a method for extracting page element information includes:
S101. Collect the set of element information triggered by user interaction events on a target page.
The target page may be a page of a website or of an application, and there is at least one target page. In this embodiment, tracking points are set on the target page in advance, for example by full (automatic) tracking or code-based tracking. Through the tracking points, the configured user interaction events are bound to elements on the target page; when a user interaction event is detected, the information of the bound element is triggered. Element information includes, but is not limited to, the element content (also called the element name), the element type, the element position, and the page where the element is located.
Optionally, while the user interacts with the target page, the element information triggered by user interaction events is collected in real time through the tracking points to form the element information set; alternatively, some or all of the element information already collected through the tracking points is selected to form the element information set.
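One way to picture the element information set is as a list of small records, one per triggered element. The sketch below is illustrative only: the class name, field names, and payload keys are assumptions, since the source lists the kinds of information recorded (content, type, position, page) but not a concrete schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementInfo:
    """One record captured when a user interaction event fires (hypothetical schema)."""
    content: str        # element content, also called the element name
    element_type: str   # e.g. "button", "a", "div"
    position: str       # e.g. a selector or DOM path
    page_url: str       # page on which the element is located
    event: str          # interaction event that triggered the element


def collect(raw_events: list[dict]) -> list[ElementInfo]:
    """Build the element information set from raw tracking payloads.

    Missing payload keys default to an empty string.
    """
    return [
        ElementInfo(
            content=e.get("content", ""),
            element_type=e.get("type", ""),
            position=e.get("position", ""),
            page_url=e.get("page", ""),
            event=e.get("event", ""),
        )
        for e in raw_events
    ]
```

Selecting "some or all" of the collected information then amounts to slicing or filtering the returned list.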
S102. Extract target element information from the element information set according to the attributes of the elements that are associated with the page's business and the metric of the number of times each element is triggered.
Elements on a page have many attributes, for example the element content, the element type, and the page where the element is located. The page's business refers to the functions, entry points, and services the target page provides. Some element attributes are associated with the page's business and some are not. Whether an attribute is associated can be determined by whether the page's business is invoked after the element is triggered, that is, whether a function, entry point, or service provided by the page is called.
Optionally, the attributes of an element that are associated with the page's business include at least one of: a click-type event bound to the element, the element being of button type, the element having a link function, the element's content length being less than a set value, the element being located on a landing page, and the element being located on a promotion page. Click-type events include, but are not limited to, touch, single click, and double click. The page invokes its business in response to the user's click-type operations, and click-type events reflect the user's focus on an element better than slide-type events do. Similarly, the page invokes the corresponding business in response to the user triggering a button-type element, and jumps to the corresponding page in response to the user triggering an element with a link function. As for content length: if an element's content is shorter than the set value, the content is short text, and short-text elements can identify a service or function of the page, for example labels such as the page title, music, video, or games. A landing page is the web page shown to a potential user after they click an advertisement or run a search in a search engine. It generally displays extended content related to the clicked advertisement or search result link, and is typically optimized for search engines around a particular keyword or phrase. A landing page therefore presents the available functions, entry points, and services together and is closely tied to the page's business; moreover, compared with subsequent pages, it faces the user's interaction directly and better reflects the user's degree of focus on elements. At present, registrations in user-acquisition campaigns and orders placed during promotions are often reached through promotion channels; when an element on a promotion page is triggered by an interaction event, promotional content is displayed or the user is taken to the merchant's page.
The metric of the number of times an element is triggered includes the element's conversion rate and/or click-through rate.
This metric reflects the user's degree of focus on the element: the higher the focus, the clearer the user behavior semantics it reflects. User behavior semantics in this embodiment differ from the semantics of images or text; they are business-oriented. For example, when a user clicks the read button on a page and uses the reading service the page provides, this reflects the user's behavior semantics to some extent, but not clearly enough to rule out a mistaken or random operation. When, in addition, the trigger-count metric of the read button falls within the preset range, it reflects a clear reading-behavior semantics.
The target element information extracted in this embodiment includes, but is not limited to, the element content, the element type, and the page where the element is located (which may span multiple uniform resource locators).
In this embodiment, collecting the set of element information triggered by user interaction events on the target page captures all of the user behavior data. Extracting target element information from that set according to the business-associated attributes of the elements and their trigger-count metric yields business-oriented element information that clearly reflects the semantics of user behavior, which increases the value of element information for product and operations optimization.
Further, a bound click-type event, a button type, a link function, a content length below the set value, and a location on a landing page or promotion page all clearly express an element's business attributes, so target element information extracted on the basis of these attributes is oriented to the page's business. An element's conversion rate and/or click-through rate captures how users focus on the element and reflects clear behavioral semantics, which helps extract target element information precisely.
Embodiment 2
FIG. 2 is a flowchart of a method for extracting page element information in Embodiment 2 of the present application. This embodiment optimizes and improves on the technical solutions of the foregoing embodiments.
Further, the operation "extract target element information from the element information set according to the attributes of the elements that are associated with the page's business and the metric of the number of times each element is triggered" is refined into "extract a candidate element information set from the element information set according to the attributes of the elements that are associated with the page's business; extract the target element information from the candidate element information set according to the metric of the number of times each element is triggered", so as to extract target element information quickly and reduce the amount of computation.
As shown in FIG. 2, a method for extracting page element information includes:
S201. Collect the set of element information triggered by user interaction events on a target page.
S202. Extract a candidate element information set from the element information set according to the attributes of the elements that are associated with the page's business.
The element information set stores each attribute of every element, so attributes can be compared directly within the set to extract the candidate element information set whose attributes are associated with the page's business.
Optionally, the element information triggered by click-type events is first extracted from the element information set. Then, from the element information triggered by click-type events, at least one of button-type element information, link-type element information, element information whose content length is less than the set value, element information located on a landing page, and element information located on a promotion page is extracted to form the candidate element information set.
Specifically, after the element information triggered by click-type events is extracted, the attributes of those elements are extracted, including the element type (class), whether the element is a link (href), the length of the element content (content), and the page where the element is located. Then the element information of button type, the element information whose content length is less than the set value, the link-type element information, the element information located on a landing page, and the element information located on a promotion page are extracted to form the candidate element information set. For ease of description and distinction, the element information in the candidate element information set is called candidate element information.
In this embodiment, click-type events reflect the user's focus on an element better than other interaction events do. Selecting element information by click-type event first retains the widest range of usable element information compared with selecting by any other attribute. Candidate elements are then extracted according to whether the element is a button or a link, its content length, and the page on which it is located, which improves the efficiency of extraction.
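The two-step candidate extraction of S202 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the dictionary keys, the event names in `CLICK_EVENTS`, and the value of `MAX_CONTENT_LENGTH` are all hypothetical, and treating empty content as not qualifying for the short-text rule is a choice made here, not something the source states.

```python
MAX_CONTENT_LENGTH = 10  # assumed "set value" for short element text
CLICK_EVENTS = {"touch", "click", "dblclick"}  # assumed event names


def is_business_related(el: dict) -> bool:
    """True if the element has at least one business-associated attribute."""
    content = el.get("content", "")
    return (
        el.get("type") == "button"                  # button-type element
        or bool(el.get("href"))                     # element with a link function
        or 0 < len(content) < MAX_CONTENT_LENGTH    # short, non-empty text
        or bool(el.get("on_landing_page"))          # located on a landing page
        or bool(el.get("on_promotion_page"))        # located on a promotion page
    )


def extract_candidates(element_infos: list[dict]) -> list[dict]:
    """Filter by click-type event first, then by business-associated attributes."""
    # Step 1: keep only elements triggered by click-type events.
    clicked = [el for el in element_infos if el.get("event") in CLICK_EVENTS]
    # Step 2: keep clicked elements with at least one qualifying attribute.
    return [el for el in clicked if is_business_related(el)]
```

Both steps read attributes that are already stored in the element information set, so no per-element metric has to be computed before the candidate set is formed.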
S203. Extract target element information from the candidate element information set according to the indicator of the number of times an element is triggered (the trigger-count indicator).
This operation includes the following three optional implementations.
First optional implementation: calculate the conversion rate of each candidate element, and determine, from the candidate element information set, the candidate element information whose conversion rate is within a first preset range as the target element information.
Second optional implementation: calculate the click-through rate of each candidate element, and determine, from the candidate element information set, the candidate element information whose click-through rate is within a second preset range as the target element information.
Third optional implementation: calculate both the conversion rate and the click-through rate of each candidate element, and determine, from the candidate element information set, the candidate element information whose conversion rate is within the first preset range and the candidate element information whose click-through rate is within the second preset range as the target element information.
In this embodiment, the conversion rate and the click-through rate clearly reflect the user's focusing behavior on an element and carry explicit behavioral semantics. Their values reflect the degree of the user's focus on the element and therefore the strength of the behavioral semantics. On this basis, the first preset range and/or the second preset range can be set according to business needs, and elements whose conversion rate falls within the first preset range and/or whose click-through rate falls within the second preset range can be selected, so that target element information of different semantic strengths is extracted.
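As a minimal sketch of the three optional implementations, the function below takes precomputed rates and returns the identifiers of the selected elements. The function names, the dictionary inputs, and the inclusive range bounds are assumptions made for illustration. The third implementation is read here as the union of the two selections, which is one possible interpretation of the text.

```python
from typing import Dict, Optional, Set, Tuple

Range = Tuple[float, float]  # (low, high), treated as inclusive here


def in_range(value: float, bounds: Range) -> bool:
    low, high = bounds
    return low <= value <= high


def select_targets(
    conversion: Dict[str, float],
    ctr: Dict[str, float],
    conv_range: Optional[Range] = None,
    ctr_range: Optional[Range] = None,
) -> Set[str]:
    """Pass only conv_range (first implementation), only ctr_range
    (second implementation), or both (third implementation)."""
    selected: Set[str] = set()
    if conv_range is not None:
        selected |= {e for e, r in conversion.items() if in_range(r, conv_range)}
    if ctr_range is not None:
        selected |= {e for e, r in ctr.items() if in_range(r, ctr_range)}
    return selected
```

Narrowing or shifting either range changes which semantic strength is selected, which is how the ranges can be tuned to business needs.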
This embodiment calculates the click-through rate and the conversion rate of candidate elements at session granularity.
Taking session granularity as an example, multiple sessions generated while users interact with the target page are obtained. Each session includes the element information triggered by click events within a preset time period, which may be 30 minutes, 40 minutes, and so on. Each candidate element is scored by calculating its conversion rate or click-through rate. When calculating the conversion rate, for each session, if an element is clicked within the session, the conversion of that element in the session is counted as 1 regardless of the number of clicks; otherwise it is 0. For each piece of candidate element information, the number of sessions in which its conversion is 1 is counted and divided by the total number of sessions to obtain its conversion rate.
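The session-level conversion rate described above can be sketched as follows. The input shape is an assumption: each session is given as a collection of the element identifiers clicked within it, already split by the preset time period.

```python
from collections import Counter
from typing import Dict, Iterable, List


def conversion_rates(
    sessions: List[Iterable[str]], candidates: Iterable[str]
) -> Dict[str, float]:
    """Fraction of sessions in which each candidate was clicked at least once."""
    candidate_ids = set(candidates)
    total = len(sessions)
    if total == 0:
        return {c: 0.0 for c in candidate_ids}
    converted = Counter()
    for clicked_ids in sessions:
        # A set collapses repeated clicks: each session counts at most once.
        for element_id in set(clicked_ids) & candidate_ids:
            converted[element_id] += 1
    return {c: converted[c] / total for c in candidate_ids}
```

For example, with four sessions in which "buy" is clicked twice in one session and "more" is clicked in two sessions, the rates are 1/4 for "buy" and 2/4 for "more".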
When calculating the click-through rate, the number of times each piece of candidate element information is displayed on the target page may be determined from the number of jumps to the target page or the number of times it is refreshed. The number of sessions containing each piece of candidate element information is divided by the number of times that candidate element information is displayed on the target page to obtain its click-through rate.
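A short sketch of this click-through rate calculation is given below. Both inputs are assumed to be precomputed: `sessions_with_click` maps an element identifier to the number of sessions containing it, and `display_counts` maps it to the number of times it was displayed (for instance, derived from page jumps or refreshes). Skipping elements that were never displayed is a choice made for this example.

```python
from typing import Dict


def click_through_rates(
    sessions_with_click: Dict[str, int], display_counts: Dict[str, int]
) -> Dict[str, float]:
    """Sessions containing the element divided by its display count."""
    rates: Dict[str, float] = {}
    for element_id, shown in display_counts.items():
        if shown <= 0:
            continue  # never displayed: the rate is undefined, so skip it
        rates[element_id] = sessions_with_click.get(element_id, 0) / shown
    return rates
```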
Optionally, before the target element information is determined, element information whose conversion rate exceeds a first preset threshold and/or element information whose click-through rate exceeds a second preset threshold is deleted from the candidate element information set. Specifically, the first preset threshold and the second preset threshold can be set independently. After the conversion rate and/or the click-through rate is calculated for each piece of candidate element information, element information with an excessively high conversion rate and/or click-through rate is removed. For example, a conversion rate above 5% is usually considered high. Such elements tend to be routinely reached elements that serve the basic business of the page, such as a login button or a save button, and their information hardly reflects the personalized behavioral semantics of users. Deleting candidate element information with a high trigger-count indicator makes it possible to concentrate on extracting element information that reflects the users' personalized behavioral semantics.
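The optional deletion step can be sketched as a filter applied before target selection. The function and parameter names are hypothetical; the 5% value used in the usage note comes from the example in the text, and a threshold of `None` means that check is not applied.

```python
from typing import Dict, Iterable, List, Optional


def drop_routine_elements(
    candidates: Iterable[str],
    conversion: Dict[str, float],
    ctr: Dict[str, float],
    max_conversion: Optional[float] = None,
    max_ctr: Optional[float] = None,
) -> List[str]:
    """Remove candidates whose rate exceeds a configured threshold."""
    kept: List[str] = []
    for element_id in candidates:
        too_converted = (
            max_conversion is not None
            and conversion.get(element_id, 0.0) > max_conversion
        )
        too_clicked = max_ctr is not None and ctr.get(element_id, 0.0) > max_ctr
        if not (too_converted or too_clicked):
            kept.append(element_id)
    return kept
```

Calling it with `max_conversion=0.05` would discard an element such as a login button with a 30% conversion rate while keeping elements with lower rates.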
In this embodiment, the attributes can be read directly from the element information set, whereas the trigger-count indicator has to be computed. Extracting element information by attribute first and then performing a second extraction by the trigger-count indicator allows the target element information to be extracted quickly and reduces the amount of computation. Further, at session granularity, the conversion rate and the click-through rate are calculated from whether the user clicked a candidate element within the preset time period, which avoids the error introduced when a user clicks the same candidate element repeatedly within a short time and improves the accuracy of both rates.
In each of the above embodiments, after the target element information is extracted from the element information set according to the attributes of the element associated with the page business and the trigger-count indicator of the element, the method further includes: setting tracking points on a page to be tracked according to the target element information. Specifically, code-based tracking or visual tracking is used to set tracking points for the target element information on the page to be tracked, so that business-oriented element information that clearly reflects the semantics of user behavior can be collected through the tracking points.
In this embodiment, tracking points are set on the page to be tracked according to the target element information, so that business-oriented element information that clearly reflects the semantics of user behavior is accurately extracted from that page. This tracking method suits novice users and users who are not professional analysts, and its technical threshold is low. Starting from the element information set, it bridges the gap between the element information set and the tracking points that users configure, and produces tracking-point recommendations that are meaningful for the business.
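One way to picture the tracking-point recommendation is as a small configuration generated from the target elements. The sketch below is purely illustrative: the `build_tracking_plan` function, the plan layout, the `selector` field, and the example URL are all hypothetical and are not taken from the application or from any particular tracking SDK.

```python
import json
from typing import Dict, List


def build_tracking_plan(page_url: str, targets: List[Dict[str, str]]) -> Dict:
    """Turn target element information into a list of click events to track."""
    return {
        "page": page_url,
        "events": [
            {
                "event_name": "click_" + t["element_id"],
                "selector": t["selector"],
                "trigger": "click",
            }
            for t in targets
        ],
    }


plan = build_tracking_plan(
    "https://example.com/landing",
    [{"element_id": "buy", "selector": "#buy-button"}],
)
print(json.dumps(plan, indent=2))
```

A plan of this kind could be consumed by either code-based or visual tracking tooling, since it only names the elements and the event to record.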
Embodiment 3
FIG. 3 is a structural diagram of an apparatus for extracting page element information according to Embodiment 3 of the present application. This embodiment is applicable to extracting, from the page element information triggered by user interaction behavior, the element information that has application value for product and operation optimization. The apparatus is implemented in software and/or hardware and is configured in an electronic device with a certain data computing capability.
As shown in FIG. 3, an apparatus 300 for extracting page element information includes a collection module 301 and an extraction module 302, wherein:
The collection module 301 is configured to collect the set of element information triggered by user interaction events on a target page.
The extraction module 302 is configured to extract target element information from the element information set according to the attributes of the element associated with the page business and the trigger-count indicator of the element.
In this embodiment of the present application, all user behavior data is obtained by collecting the set of element information triggered by user interaction events on the target page. Target element information is then extracted from the element information set according to the attributes of the element associated with the page business and the trigger-count indicator of the element, so that business-oriented element information that clearly reflects the semantics of user behavior is extracted, which increases the application value of the element information for product and operation optimization.
Further, the attributes of the element associated with the page business include at least one of: a click event bound to the element, the button type of the element, the link function of the element, an element content length less than a set value, the element being located on a landing page, and the element being located on a promotion page;
The trigger-count indicator of the element includes the conversion rate and/or the click-through rate of the element.
Further, the extraction module 302 includes a candidate element information set extraction unit and a target element information extraction unit. The candidate element information set extraction unit is configured to extract a candidate element information set from the element information set according to the attributes of the element associated with the page business. The target element information extraction unit is configured to extract the target element information from the candidate element information set according to the trigger-count indicator of the element.
Further, the candidate element information set extraction unit is specifically configured to: extract, from the element information set, the element information triggered by click events; and extract, from the element information triggered by click events, at least one of element information of the button type, element information of the link type, element information whose content length is less than the set value, element information located on a landing page, and element information located on a promotion page, to form the candidate element information set.
Further, the target element information extraction unit is specifically configured to: calculate the conversion rate and/or the click-through rate of each candidate element; and determine, from the candidate element information set, the candidate element information whose conversion rate is within the first preset range and/or the candidate element information whose click-through rate is within the second preset range as the target element information.
Further, when calculating the conversion rate and/or the click-through rate of each candidate element, the target element information extraction unit is specifically configured to: obtain multiple sessions generated while users interact with the target page, each session including the element information triggered by click events within a preset time period; calculate the conversion rate of each candidate element from the number of sessions containing that candidate element information and the total number of sessions; and/or calculate the click-through rate of each candidate element from the number of sessions containing that candidate element information and the number of times that candidate element information is displayed on the target page.
Further, the apparatus also includes a deletion unit configured to, before the element information whose conversion rate is within the first preset range and/or the element information whose click-through rate is within the second preset range is determined from the candidate element information set as the target element information, delete from the candidate element information set the element information whose conversion rate exceeds the first preset threshold and/or the element information whose click-through rate exceeds the second preset threshold.
Further, the apparatus also includes a tracking module configured to set tracking points on the page to be tracked according to the target element information.
The above apparatus for extracting page element information can execute the method for extracting page element information provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to executing that method.
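The module structure of the apparatus can be pictured as simple composition. The sketch below is an assumption-laden illustration rather than the apparatus itself: the class names, the use of plain callables for the modules and units, and the dictionary-shaped element information are all hypothetical.

```python
from typing import Callable, Dict, List

ElementInfos = List[Dict]


class ExtractionModule:
    """Chains the candidate-set unit and the target-element unit."""

    def __init__(
        self,
        candidate_unit: Callable[[ElementInfos], ElementInfos],
        target_unit: Callable[[ElementInfos], ElementInfos],
    ) -> None:
        self.candidate_unit = candidate_unit
        self.target_unit = target_unit

    def extract(self, element_info_set: ElementInfos) -> ElementInfos:
        # Attribute-based selection first, trigger-count selection second.
        return self.target_unit(self.candidate_unit(element_info_set))


class ExtractionApparatus:
    """Wires a collection module to an extraction module."""

    def __init__(
        self,
        collection_module: Callable[[str], ElementInfos],
        extraction_module: ExtractionModule,
    ) -> None:
        self.collection_module = collection_module
        self.extraction_module = extraction_module

    def run(self, page_id: str) -> ElementInfos:
        return self.extraction_module.extract(self.collection_module(page_id))
```

Passing the units in as callables keeps the two extraction stages replaceable, for example to add a deletion step between them.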
Embodiment 4
According to the embodiments of the present application, the present application further provides an electronic device and a readable storage medium.
FIG. 4 is a block diagram of an electronic device that implements the method for extracting page element information according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 4, the electronic device includes one or more processors 401, a memory 402, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other ways as needed. The processor can process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories if desired. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). One processor 401 is taken as an example in FIG. 4.
The memory 402 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor executes the method for extracting page element information provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the method for extracting page element information provided by the present application.
As a non-transitory computer-readable storage medium, the memory 402 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for extracting page element information in the embodiments of the present application (for example, the collection module 301 and the extraction module 302 shown in FIG. 3). By running the non-transitory software programs, instructions, and modules stored in the memory 402, the processor 401 executes the various functional applications and data processing of the server, that is, implements the method for extracting page element information in the above method embodiments.
The memory 402 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the electronic device that implements the method for extracting page element information. In addition, the memory 402 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 402 may optionally include memories located remotely from the processor 401, and these remote memories may be connected over a network to the electronic device that executes the method for extracting page element information. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The electronic device that executes the method for extracting page element information may further include an input device 403 and an output device 404. The processor 401, the memory 402, the input device 403, and the output device 404 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 4.
The input device 403 can receive input numeric or character information and generate key signal inputs related to the user settings and function control of the electronic device that executes the method for extracting page element information. Examples of the input device include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 404 may include a display device, auxiliary lighting devices (for example, LEDs), tactile feedback devices (for example, vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and it can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
These computer programs (also called programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented in high-level procedural and/or object-oriented programming languages and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, magnetic disks, optical disks, memories, and programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein can be implemented on a computer that has a display device for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user. For example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
The systems and techniques described herein can be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the Internet, and blockchain networks.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.
According to the technical solutions of the embodiments of the present application, all user behavior data is obtained by collecting the set of element information triggered by user interaction events on the target page. Target element information is then extracted from the element information set according to the attributes of the element associated with the page business and the trigger-count indicator of the element, so that business-oriented element information that clearly reflects the semantics of user behavior is extracted, which increases the application value of the element information for product and operation optimization.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above specific implementations do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010093390.7A CN111310044B (en) | 2020-02-14 | 2020-02-14 | Method, device, equipment and storage medium for extracting page element information |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111310044A (en) | 2020-06-19 |
| CN111310044B (en) | 2023-09-26 |
Family
ID=71161722
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010093390.7A Active CN111310044B (en) | 2020-02-14 | 2020-02-14 | Method, device, equipment and storage medium for extracting page element information |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111310044B (en) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |