
CN104063394A - Method and device for determining target page as well as equipment - Google Patents

Info

Publication number
CN104063394A
Authority
CN
China
Prior art keywords
information
webpage
candidate
released
webpages
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310092363.8A
Other languages
Chinese (zh)
Other versions
CN104063394B (en)
Inventor
吴晨
江潇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201310092363.8A
Publication of CN104063394A
Application granted
Publication of CN104063394B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The purpose of the present invention is to provide a method, device and equipment for determining a target webpage. The method according to the present invention comprises: acquiring information to be published and a plurality of candidate webpages corresponding to it; acquiring the matching degree between the information to be published and each of the candidate webpages; and selecting, according to those matching degrees, one or more candidate webpages as the target webpage corresponding to the information to be published. The advantage of the present invention is that a webpage matching the information to be published can be determined automatically, without manual configuration, which improves operating efficiency. Moreover, when webpage content changes, the computer device can execute the solution of the present invention again on the changed webpages to re-determine the target webpage corresponding to the information to be published. The solution of the present invention can therefore automatically and efficiently determine, for the information to be published, a target webpage with a high matching degree.

Description

A method, device and equipment for determining a target webpage

Technical field

The present invention relates to the field of computer technology, and in particular to a method, device and equipment for determining a target webpage.

Background

In Internet marketing, a landing page (sometimes called a lead capture page) is the webpage shown to a potential user after the user clicks an advertisement or runs a search on a search engine.

In the prior art, the landing page is usually determined solely from a preset correspondence between advertisements and webpages. The problem with this approach is that an advertisement is usually mapped only to the home page of a website, so a user who clicks the advertisement cannot directly obtain information related to its content. Setting up a dedicated webpage for every advertisement, however, requires a very large amount of work, and adjusting the configured correspondences also takes considerable time and effort.

Summary of the invention

The purpose of the present invention is to provide a method, device and equipment for determining a target webpage.

According to one aspect of the present invention, a method implemented by a computer device for determining a target webpage is provided, wherein the method comprises the following steps:

a. acquiring information to be published and a plurality of candidate webpages corresponding to it;

b. acquiring the matching degree between the information to be published and each candidate webpage among the plurality of candidate webpages;

c. selecting, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as the target webpage corresponding to the information to be published.

According to another aspect of the present invention, a webpage determining device for determining a target webpage is also provided, wherein the webpage determining device comprises:

a first acquiring device, configured to acquire information to be published and a plurality of candidate webpages corresponding to it;

a second acquiring device, configured to acquire the matching degree between the information to be published and each candidate webpage among the plurality of candidate webpages;

a selecting device, configured to select, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as the target webpage corresponding to the information to be published.

According to another aspect of the present invention, a computer device is also provided, wherein the computer device comprises the webpage determining device.

Compared with the prior art, the present invention has the following advantages: the computer device can automatically determine a webpage matching the information to be published, without manual configuration, which improves operating efficiency. Moreover, when webpage content changes, the computer device can execute the solution of the present invention again on the changed webpages to re-determine the target webpage corresponding to the information to be published. The solution of the present invention can therefore automatically and efficiently determine, for the information to be published, a target webpage with a high matching degree, which improves implementation efficiency.

Description of drawings

Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:

Fig. 1 is a flowchart of a method, implemented by a computer device, for determining a target webpage according to one aspect of the present invention;

Fig. 2 is a schematic structural diagram of a webpage determining device for determining a target webpage according to one aspect of the present invention.

The same or similar reference numerals in the drawings denote the same or similar components.

Detailed description

The present invention is described in further detail below with reference to the accompanying drawings.

Fig. 1 shows a flowchart of a method, implemented by a computer device, for determining a target webpage according to one aspect of the present invention. The method according to the present invention comprises step S1, step S2 and step S3.

The method according to the present invention is implemented by a computer device. The computer device is an electronic device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, and the like. The computer device includes network devices and user devices. A network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers. A user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touchpad, voice-control device or similar means, for example a personal computer, tablet computer, smartphone, PDA, game console or IPTV. The network in which the user device and the network device are located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, and the like.

It should be noted that the user devices, network devices and networks described above are only examples; other existing or future user devices, network devices and networks, where applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.

Specifically, referring to Fig. 1, in step S1 the computer device acquires information to be published and a plurality of candidate webpages corresponding to it.

The information to be published includes any kind of information that a user wishes to publish on the Internet, including but not limited to text, webpages and multimedia.

The computer device may determine the information to be published directly from user input, or from the information to be published that is used by one or more other users. For example, the computer device takes the n items of information to be published that are used most by other users as its own information to be published.

Specifically, the computer device determines the candidate webpages corresponding to the information to be published according to a predetermined webpage range. The predetermined webpage range includes, but is not limited to, preset webpage link information of at least one webpage.

Preferably, the predetermined webpage range includes all webpages contained in the website to which the preset webpage link information belongs.

For example, if the preset webpage link information includes http://www.abc.com/page1.html, the predetermined webpage range includes all webpages contained in the website corresponding to the domain name "http://www.abc.com/".
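The expansion from a preset link to its whole website can be sketched as follows. This is a minimal illustration only: the helper names `site_root` and `candidate_pages`, and the idea that a list of known page URLs (`known_pages`) is already available from some crawl or sitemap, are assumptions and are not described in the patent.

```python
from urllib.parse import urlsplit


def site_root(url):
    """Return the scheme-and-host prefix of a URL, e.g. 'http://www.abc.com/'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def candidate_pages(preset_links, known_pages):
    """Keep every known page that lives on the same site as a preset link.

    `known_pages` is a hypothetical, already collected list of page URLs.
    """
    roots = {site_root(link) for link in preset_links}
    return [page for page in known_pages if site_root(page) in roots]


preset = ["http://www.abc.com/page1.html"]
known = [
    "http://www.abc.com/page1.html",
    "http://www.abc.com/products/page2.html",
    "http://www.xyz.com/other.html",
]
print(candidate_pages(preset, known))
```

Only the two pages under http://www.abc.com/ are retained as candidates; the page on the other host is dropped.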

Next, in step S2, the computer device acquires the matching degree between the information to be published and each candidate webpage among the plurality of candidate webpages.

Specifically, the computer device determines the matching degree between the information to be published and each candidate webpage according to the similarity between the information to be published and the webpage content of each candidate webpage, and/or the rank of each candidate webpage in the on-site search results of the website to which it belongs.

The ways in which the computer device acquires the matching degree between the information to be published and each candidate webpage include, but are not limited to, any of the following:

1) Acquiring the similarity between the information to be published and the webpage content of each candidate webpage, and determining the matching degree between the information to be published and each candidate webpage according to that similarity.

The webpage content includes, but is not limited to, at least one of the following:

a) anchor text information;

b) webpage topic information;

c) webpage body text information.

Specifically, the ways of acquiring the similarity between the information to be published and the webpage content of each candidate webpage include, but are not limited to, any of the following:

a) Using text mining techniques to analyze the similarity between the information to be published and the webpage content of each candidate webpage.

For example, the TF-IDF value of the information to be published relative to the webpage content of a candidate webpage is calculated and used as the similarity information.
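A TF-IDF style similarity of this kind could look like the sketch below. The patent does not specify tokenization or the exact weighting, so whitespace tokenization, the smoothed IDF term and the function name `tfidf_scores` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_scores(info, pages):
    """Score each page by the summed TF-IDF weight of the terms in `info`.

    `pages` maps a page name to its text. Tokenization is naive whitespace
    splitting, and the IDF is smoothed as log((1 + N) / (1 + df)) + 1; both
    are illustrative choices, not taken from the patent.
    """
    docs = {name: text.lower().split() for name, text in pages.items()}
    n_docs = len(docs)

    # Document frequency: in how many pages does each term occur?
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    query_terms = set(info.lower().split())
    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty pages
        scores[name] = sum(
            (tf[term] / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term in query_terms
        )
    return scores


example_pages = {"p1": "red apple juice", "p2": "blue car"}
print(tfidf_scores("apple", example_pages))
```

A page that never mentions any term of the information to be published scores zero, so it would rank below any page that does.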

b) The computer device calculates the edit distance between the information to be published and the webpage content of the candidate webpage, and determines the similarity information according to that edit distance.

Preferably, the similarity is determined according to the edit distance between the information to be published and one or more items of anchor text information in the candidate webpage.

More preferably, the computer device obtains the edit distances between the information to be published and multiple anchor texts of the candidate webpage, and determines the similarity of the information to be published to the candidate webpage based on those edit distances.

For example, the average edit distance of the information to be published relative to the webpage content of the candidate webpage is first computed from the multiple edit distances, and the corresponding matching degree level is then determined based on a predetermined rule for converting edit distance to matching degree. The conversion rule may be a predetermined correspondence or a predetermined conversion function.
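The averaging and conversion step can be sketched as below, assuming the Levenshtein distance is the edit distance meant here. The thresholds in `match_level` are hypothetical; the patent only says that some predetermined correspondence or conversion function is used.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if characters match)
            ))
        prev = cur
    return prev[-1]


def average_edit_distance(info, anchor_texts):
    """Mean edit distance between the information and a page's anchor texts."""
    return sum(edit_distance(info, text) for text in anchor_texts) / len(anchor_texts)


def match_level(avg_distance):
    """Map an average edit distance to a level; the thresholds are hypothetical."""
    if avg_distance <= 1:
        return "high"
    if avg_distance <= 3:
        return "medium"
    return "low"


avg = average_edit_distance("apple", ["apple", "apples"])
print(avg, match_level(avg))
```

A lookup table keyed on distance ranges would serve equally well as the conversion rule; a function is used here only because it is shorter.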

2) When some or all of the plurality of candidate webpages belong to the same website, the computer device communicates with that website to obtain the result of a query that the website performs on its own pages based on the information to be published, where the query result contains ranking information for those candidate webpages. The computer device then determines the matching degree between the information to be published and each of those candidate webpages according to the ranking information contained in the query result.

According to a first example of the present invention, the three candidate webpages page1, page2 and page3 corresponding to the information to be published Info1 are all contained in website Site1. The computer device communicates with Site1 to obtain the results of an on-site search that Site1 performs based on Info1, in which page1, page2 and page3 are ranked 2nd, 5th and 10th respectively. From these search result ranks, the computer device determines that the matching degree between the candidate webpages and Info1, from high to low, is page1, page2 and page3.

Preferably, the computer device may determine the matching degree between the information to be published and each candidate webpage according to both the similarity between the information to be published and the webpage content of each candidate webpage and the rank of each candidate webpage in the on-site search results of the website to which it belongs.

According to a second example of the present invention, of the five candidate webpages corresponding to the information to be published Info2, page4 to page6 belong to website Site2, and page7 and page8 belong to website Site3. The computer device determines that the edit distances of Info2 relative to the five candidate webpages are 1, 2, 1, 2 and 4 respectively. Through communication with Site2, the computer device learns that an on-site search by Site2 based on Info2 returns 50 results in total, in which page4, page5 and page6 are ranked 3rd, 4th and 9th respectively. Through communication with Site3, the computer device learns that an on-site search by Site3 based on Info2 returns 25 results in total, in which page7 and page8 are ranked 1st and 5th respectively.

Next, the computer device determines the matching degree of the information to be published relative to each candidate webpage according to a predetermined matching degree formula, as follows:

Matching degree = (1 / edit distance) × (1 − search result rank / total number of search results);

According to this formula, the computer device determines that the matching degrees of Info2 relative to candidate webpages page4 to page8 are 0.94, 0.46, 0.82, 0.48 and 0.2 respectively.
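The figures of the second example can be reproduced with a few lines of code. The function name `matching_degree` and the dictionary layout are illustrative assumptions; the formula and the input values are those given in the example. The sketch assumes an edit distance of at least 1, since the formula as stated is undefined for a distance of 0.

```python
def matching_degree(edit_distance, rank, total_results):
    """(1 / edit distance) * (1 - rank / total results), as in the second example.

    Assumes edit_distance >= 1; a distance of 0 would need special handling.
    """
    return (1 / edit_distance) * (1 - rank / total_results)


# (edit distance, rank in on-site search, total on-site results) per page
second_example = {
    "page4": (1, 3, 50),
    "page5": (2, 4, 50),
    "page6": (1, 9, 50),
    "page7": (2, 1, 25),
    "page8": (4, 5, 25),
}

degrees = {
    page: round(matching_degree(*values), 2)
    for page, values in second_example.items()
}
print(degrees)
# {'page4': 0.94, 'page5': 0.46, 'page6': 0.82, 'page7': 0.48, 'page8': 0.2}
```

For page5, for instance, the calculation is (1/2) × (1 − 4/50) = 0.5 × 0.92 = 0.46.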

It should be noted that the formulas, numerical values and the like used in this specification are only examples intended to aid understanding of the present invention. They are not real data or formulas from an actual application and should not be construed as limiting the present invention. Determining the matching degree with other formulas or values, based on the principles disclosed herein, should also fall within the scope of protection of the present invention. Unless otherwise specified, the formulas, values and strings appearing elsewhere in this document serve the same illustrative purpose; for brevity, this is not repeated.

It should further be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation that acquires the matching degree between the information to be published and each candidate webpage among the plurality of candidate webpages should fall within the scope of the present invention.

Next, in step S3, the computer device selects, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as the target webpage corresponding to the information to be published.

Specifically, the computer device applies a predetermined selection rule to the matching degrees between the information to be published and the candidate webpages to select one or more candidate webpages as the target webpage.

For example, if the predetermined selection rule is to select the m candidate webpages with the highest matching degree as target webpages, the computer device ranks the candidate webpages by matching degree and takes the top m as target webpages.

Preferably, for each candidate webpage, the computer device judges whether the matching degree between that candidate webpage and the information to be published reaches a predetermined threshold and, if so, takes that candidate webpage as a target webpage of the information to be published.

More preferably, when no candidate webpage reaches the predetermined threshold, the computer device takes a predetermined webpage as the target webpage of the information to be published.

The predetermined webpage includes, but is not limited to, a webpage preset by the user.
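The two selection rules described above, top m and threshold with a fallback page, can be sketched as follows. The function names, the threshold values and the fallback URL are hypothetical; the matching degrees reuse the figures of the second example.

```python
def select_top_m(degrees, m):
    """Return the m candidate pages with the highest matching degree."""
    return sorted(degrees, key=degrees.get, reverse=True)[:m]


def select_by_threshold(degrees, threshold, default_page):
    """Return every page whose matching degree reaches the threshold.

    Falls back to a predetermined page (for example, one preset by the user)
    when no candidate qualifies.
    """
    chosen = [page for page, degree in degrees.items() if degree >= threshold]
    return chosen or [default_page]


# Matching degrees from the second example.
example_degrees = {
    "page4": 0.94, "page5": 0.46, "page6": 0.82, "page7": 0.48, "page8": 0.2,
}
fallback = "http://www.abc.com/"  # hypothetical preset page

print(select_top_m(example_degrees, 2))
print(select_by_threshold(example_degrees, 0.8, fallback))
print(select_by_threshold(example_degrees, 0.99, fallback))
```

With these inputs both rules pick page4 and page6, and the last call returns only the fallback page because no candidate reaches 0.99.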

It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. Those skilled in the art should understand that any implementation in which a computer device selects, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as the target webpage corresponding to the information to be published should fall within the scope of the present invention.

In one preferred embodiment of the present invention, the method further comprises step S4 (not shown) and step S5 (not shown).

In step S4, the computer device generates one or more items of web publishing information according to the information to be published and its corresponding at least one target webpage.

Each item of web publishing information contains indication information for its corresponding target webpage. The indication information includes, but is not limited to, webpage link information pointing to the target webpage.

Preferably, the web publishing information includes, but is not limited to, any kind of information that can be presented in different forms in a webpage.

For example, when the information to be published includes the keyword "apple", the web publishing information may include a URL embedded in a webpage that contains the keyword "apple", or a picture embedded in a webpage with "apple" as its subject.

Continuing with the second example above, the computer device applies a predetermined selection rule of taking the best-matching webpage and determines that the target webpage corresponding to the information to be published Info2 is page4. The computer device then generates hyperlink information Link1 that uses the content of Info2 as its anchor text and points to page4.
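Generating hyperlink information such as Link1 could be as simple as the sketch below. The function name `build_link`, the HTML anchor form and the page4 URL are assumptions for illustration; the patent only requires that the publishing information carry the anchor text and point to the target webpage.

```python
from html import escape


def build_link(info_text, target_url):
    """Build an HTML anchor using the information as anchor text.

    Both values are HTML-escaped so that special characters in the
    information or the URL cannot break the markup.
    """
    return f'<a href="{escape(target_url, quote=True)}">{escape(info_text)}</a>'


# Hypothetical URL for page4; the patent does not give one.
link1 = build_link("Info2 & more", "http://www.abc.com/page4.html")
print(link1)
# <a href="http://www.abc.com/page4.html">Info2 &amp; more</a>
```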

Next, in step S5, when a query sequence related to the information to be published is received, the computer device returns the one or more items of web publishing information corresponding to that information to be published.

Continuing with the second example above, when the computer device receives a query sequence submitted by a user that contains the content of the information to be published Info2, the computer device provides the user with the hyperlink information Link1 generated in step S4.

According to the method of the present invention, the computer device can automatically determine a webpage matching the information to be published, without manual configuration, which improves operating efficiency. Moreover, when webpage content changes, the computer device can execute the solution of the present invention again on the changed webpages to re-determine the target webpage corresponding to the information to be published. The solution of the present invention can therefore automatically and efficiently determine, for the information to be published, a target webpage with a high matching degree, which improves implementation efficiency.

Fig. 2 shows a schematic structural diagram of a webpage determining device for determining a target webpage according to one aspect of the present invention. The webpage determining device according to the present invention comprises a first acquiring device 1, a second acquiring device 2 and a selecting device 3.

The solution according to the present invention is implemented by a computer device. The computer device is an electronic device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, and the like. The computer device includes network devices and user devices. A network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers. A user device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touchpad, voice-control device or similar means, for example a personal computer, tablet computer, smartphone, PDA, game console or IPTV. The network in which the user device and the network device are located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, and the like.

It should be noted that the user devices, network devices and networks described above are only examples; other existing or future user devices, network devices and networks, where applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.

Specifically, referring to FIG. 2, the first obtaining device 1 obtains the information to be published and its corresponding plurality of candidate webpages.

The information to be published includes any kind of information that a user wishes to publish on the Internet, including but not limited to text, webpages, and multimedia.

The first obtaining device 1 may determine the information to be published directly from user input, or from the information to be published that one or more other users have used. For example, the first obtaining device 1 takes the top n pieces of information to be published that are most used by other users as its own information to be published.

Specifically, the first obtaining device 1 determines the candidate webpages corresponding to the information to be published according to a predetermined webpage range. The predetermined webpage range includes, but is not limited to, preset webpage link information of at least one webpage.

Preferably, the predetermined webpage range includes all webpages contained in the website to which the preset webpage link information belongs.

For example, if the preset webpage link information includes http://www.abc.com/page1.html, the predetermined webpage range includes all webpages contained in the website corresponding to the domain name "http://www.abc.com/".
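The mapping from a preset link to its owning website can be sketched as follows. This is only an illustration: the helper names `site_root` and `in_predetermined_range` are hypothetical, and treating "same host name" as "same website" is an assumption the description does not spell out.

```python
from urllib.parse import urlsplit


def site_root(preset_link):
    """Return the scheme-and-host prefix identifying the site that owns the link."""
    parts = urlsplit(preset_link)
    return f"{parts.scheme}://{parts.netloc}/"


def in_predetermined_range(page_url, preset_links):
    """True if page_url is hosted on the same site as any preset link.

    Assumption: two URLs belong to the same website when their host names match.
    """
    page_host = urlsplit(page_url).netloc.lower()
    return any(urlsplit(link).netloc.lower() == page_host for link in preset_links)


preset = ["http://www.abc.com/page1.html"]
root = site_root(preset[0])
inside = in_predetermined_range("http://www.abc.com/news/2.html", preset)
outside = in_predetermined_range("http://www.xyz.com/index.html", preset)
```

Enumerating every page of the site (for example by crawling or reading a sitemap) is outside the scope of this sketch.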

Next, the second obtaining device 2 obtains the matching degree between the information to be published and each of the plurality of candidate webpages.

Specifically, the second obtaining device 2 determines the matching degree between the information to be published and each candidate webpage according to the similarity between the information to be published and the webpage content of each candidate webpage, and/or the ranking of each candidate webpage in the in-site search results of the website to which it belongs.

The ways in which the second obtaining device 2 obtains the matching degree between the information to be published and each of the plurality of candidate webpages include, but are not limited to, any of the following:

1) A first sub-obtaining device (not shown) in the second obtaining device 2 obtains the similarity between the information to be published and the webpage content of each candidate webpage; then a first determining device (not shown) in the second obtaining device 2 determines the matching degree between the information to be published and each candidate webpage according to the similarity.

The webpage content includes, but is not limited to, at least any one of the following:

a) anchor text information;

b) webpage topic information;

c) webpage body text information.

Specifically, the ways in which the first sub-obtaining device obtains the similarity between the information to be published and the webpage content of each candidate webpage include, but are not limited to, any of the following:

a) The first sub-obtaining device uses text mining techniques to analyze the similarity between the information to be published and the webpage content of each candidate webpage.

For example, the first sub-obtaining device calculates the TF-IDF value of the information to be published relative to the webpage content of the candidate webpage and uses it as the similarity information.
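As a rough illustration of the TF-IDF approach, the sketch below scores each candidate page by summing the TF-IDF weights of the terms of the information to be published. The function name `tfidf_score`, the whitespace tokenization, the smoothed IDF variant, and the sample pages are all assumptions made for this example; the description does not fix any of them.

```python
import math
from collections import Counter


def tfidf_score(info_terms, page_terms, all_pages_terms):
    """Sum the TF-IDF weights of the info terms within one candidate page."""
    counts = Counter(page_terms)
    n_pages = len(all_pages_terms)
    score = 0.0
    for term in set(info_terms):
        # Term frequency: share of the page's tokens equal to this term.
        tf = counts[term] / len(page_terms) if page_terms else 0.0
        # Document frequency: number of candidate pages containing the term.
        df = sum(1 for terms in all_pages_terms if term in terms)
        # Smoothed inverse document frequency (one common variant).
        idf = math.log((1 + n_pages) / (1 + df)) + 1.0
        score += tf * idf
    return score


# Hypothetical candidate pages, tokenized by whitespace.
pages = {
    "page1": "apple orchard apple harvest".split(),
    "page2": "banana farm apple".split(),
    "page3": "tractor repair manual".split(),
}
info = ["apple"]
corpus = list(pages.values())
scores = {name: tfidf_score(info, terms, corpus) for name, terms in pages.items()}
```

For Chinese text a word segmenter would be needed in place of `split()`; that choice is left open here.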

b) A computing device (not shown) in the first sub-obtaining device calculates the edit distance between the information to be published and the webpage content of the candidate webpage, and determines the similarity information according to that edit distance.

Preferably, the computing device determines the similarity according to the edit distance between the information to be published and one or more pieces of anchor text information in the candidate webpage.

More preferably, the computing device obtains the edit distances between the information to be published and a plurality of anchor texts of the candidate webpage, and determines the similarity of the information to be published relative to the candidate webpage based on these edit distances.

For example, the computing device first derives, from the plurality of edit distances, the average edit distance of the information to be published relative to the webpage content of the candidate webpage, and then determines the corresponding matching degree level based on a predetermined rule for converting edit distance to matching degree. The conversion rule may be a predetermined correspondence or a predetermined conversion function.
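The edit-distance variant can be sketched as below, assuming the Levenshtein distance is the intended edit distance. The `LEVELS` table is a hypothetical conversion rule invented for illustration; the description only says such a rule is predetermined.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a two-row dynamic-programming table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def average_edit_distance(info, anchor_texts):
    """Mean edit distance between the info and each anchor text of one page."""
    return sum(edit_distance(info, text) for text in anchor_texts) / len(anchor_texts)


# Hypothetical conversion rule: (upper bound on average distance, level).
LEVELS = [(1.0, "high"), (3.0, "medium")]


def match_level(avg_distance):
    """Map an average edit distance to a matching degree level."""
    for bound, level in LEVELS:
        if avg_distance <= bound:
            return level
    return "low"


avg = average_edit_distance("apple", ["apple", "apples"])
level = match_level(avg)
```

A conversion function (for example `1 / (1 + avg_distance)`) could replace the lookup table; both forms fit the "correspondence or function" wording above.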

2) When some or all of the plurality of candidate webpages belong to the same website, a second sub-obtaining device (not shown) in the second obtaining device 2 communicates with that website to obtain the result of a query that the website runs within itself based on the information to be published; the query result contains ranking information for those candidate webpages. Then a second determining device (not shown) in the second obtaining device 2 determines, according to the ranking information contained in the query result, the matching degree between the information to be published and each of those candidate webpages.

According to a first example of the present invention, the three candidate webpages page1, page2, and page3 corresponding to the information to be published Info1 are all contained in website Site1. The second sub-obtaining device communicates with Site1 to obtain the results of an in-site search that Site1 performs based on Info1, in which page1, page2, and page3 are ranked 2nd, 5th, and 10th respectively. The second determining device then determines, from these search result ranks, that the matching degrees between the three candidate webpages and Info1 rank from high to low as page1, page2, and page3.
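The first example can be mirrored with a small sketch that orders candidates by their position in a site's own search results. The result list here is made up, and how the results are fetched from the site is deliberately left out, since the description does not specify an interface; `order_by_site_rank` is a hypothetical helper name.

```python
def order_by_site_rank(search_results, candidates):
    """Order candidate pages by their 1-based position in the in-site results.

    Candidates missing from the results are dropped in this sketch.
    """
    position = {url: i for i, url in enumerate(search_results, start=1)}
    found = [c for c in candidates if c in position]
    return sorted(found, key=lambda c: position[c])


# Hypothetical in-site results for Info1: page1, page2, page3 at ranks 2, 5, 10.
results = ["other1", "page1", "other2", "other3", "page2",
           "other4", "other5", "other6", "other7", "page3"]
ordered = order_by_site_rank(results, ["page3", "page1", "page2"])
```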

Preferably, the computer device may determine the matching degree between the information to be published and each candidate webpage according to both the similarity between the information to be published and the webpage content of each candidate webpage and the ranking of each candidate webpage in the in-site search results of the website to which it belongs.

According to a second example of the present invention, among the five candidate webpages corresponding to the information to be published Info2, page4 to page6 belong to website Site2, and page7 and page8 belong to website Site3. The computer device determines that the edit distances of Info2 relative to these five candidate webpages are 1, 2, 1, 2, and 4 respectively. By communicating with Site2, the computer device learns that an in-site search based on Info2 returns 50 results in total, with page4, page5, and page6 in 3rd, 4th, and 9th place respectively. By communicating with Site3, the computer device learns that an in-site search based on Info2 returns 25 results in total, with page7 and page8 in 1st and 5th place respectively.

Next, the computer device determines the matching degree of the information to be published relative to each candidate webpage according to a predetermined matching degree formula, shown below:

matching degree = (1 / edit distance) × (1 − search result rank / total number of search results);

Using this formula, the computer device determines that the matching degrees of Info2 relative to candidate webpages page4 to page8 are 0.94, 0.46, 0.82, 0.48, and 0.2 respectively.
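The arithmetic of the second example can be reproduced with a few lines of code. The function name `matching_degree` and the tuple layout are illustrative choices; the inputs are the edit distances, ranks, and result totals stated in the example.

```python
def matching_degree(edit_distance, rank, total):
    """Matching degree = (1 / edit distance) * (1 - rank / total).

    The formula as given is undefined for an edit distance of 0; how an
    exact match should be scored is not stated, so it is not handled here.
    """
    return (1 / edit_distance) * (1 - rank / total)


# (edit distance, in-site search rank, total in-site results) per candidate.
candidates = {
    "page4": (1, 3, 50),
    "page5": (2, 4, 50),
    "page6": (1, 9, 50),
    "page7": (2, 1, 25),
    "page8": (4, 5, 25),
}
degrees = {page: round(matching_degree(*values), 2)
           for page, values in candidates.items()}
```

For page4, for instance, this gives (1/1) × (1 − 3/50) = 0.94, which agrees with the value listed above.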

It should be noted that the formulas, values, and the like used in this specification are only examples to aid understanding of the present invention. They are not real data or formulas from an actual application and should not be construed as limiting the present invention. Determining the matching degree with other formulas or values, by a person skilled in the art following the principles disclosed herein, should also fall within the protection scope of the present invention. Unless otherwise specified, the example strings appearing elsewhere in this document serve the same purpose as here and, for brevity, are not explained again.

It should further be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. A person skilled in the art should understand that any implementation that obtains the matching degree between the information to be published and each of the plurality of candidate webpages falls within the scope of the present invention.

Next, the selection device 3 selects one or more candidate webpages as target webpages corresponding to the information to be published, according to the matching degree between the information to be published and each candidate webpage.

Specifically, the selection device 3 applies a predetermined selection rule to the matching degrees between the information to be published and the candidate webpages in order to select one or more candidate webpages as target webpages.

For example, if the predetermined selection rule is to select the m candidate webpages with the highest matching degree as target webpages, the computer device ranks the candidate webpages by matching degree and takes the top m as target webpages.

Preferably, the predetermined selection rule is to select candidate webpages whose matching degree reaches a predetermined threshold as target webpages. In that case, a judging device (not shown) in the selection device 3 judges, for each candidate webpage, whether the matching degree between that candidate webpage and the information to be published reaches the predetermined threshold; when it does, a first sub-selection device (not shown) in the selection device 3 takes that candidate webpage as a target webpage of the information to be published.

More preferably, when no candidate webpage reaches the predetermined threshold, a second sub-selection device (not shown) in the selection device 3 takes a predetermined webpage as the target webpage of the information to be published.

The predetermined webpage includes, but is not limited to, a webpage preset by the user.

It should be noted that the above examples are given only to better illustrate the technical solution of the present invention and do not limit it. A person skilled in the art should understand that any implementation in which a computer device selects one or more candidate webpages as target webpages corresponding to the information to be published, according to the matching degree between the information to be published and each candidate webpage, falls within the scope of the present invention.
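The selection rules described above (top m, threshold, and fallback to a predetermined page) can be sketched in one function. The name `select_targets`, its parameters, and the use of `>=` for "reaches the threshold" are assumptions made for illustration.

```python
def select_targets(degrees, top_m=None, threshold=None, fallback=None):
    """Pick target pages from a {page: matching degree} mapping.

    - threshold given: keep pages whose degree is >= threshold; if none
      qualifies and a fallback (predetermined) page is given, return it.
    - otherwise: keep the top_m pages by degree (all pages if top_m is None).
    """
    ranked = sorted(degrees, key=degrees.get, reverse=True)
    if threshold is not None:
        chosen = [page for page in ranked if degrees[page] >= threshold]
        if not chosen and fallback is not None:
            return [fallback]
        return chosen
    return ranked[:top_m]


# Matching degrees taken from the second example.
example = {"page4": 0.94, "page5": 0.46, "page6": 0.82, "page7": 0.48, "page8": 0.2}
top_two = select_targets(example, top_m=2)
above_half = select_targets(example, threshold=0.5)
default_only = select_targets(example, threshold=0.99, fallback="preset_page")
```

Here `"preset_page"` stands in for the user-preset webpage used when no candidate reaches the threshold.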

As one preferred embodiment of the present invention, the webpage determining apparatus further includes a generating device (not shown) and a feedback device (not shown).

The generating device generates one or more pieces of web publishing information according to the information to be published and its corresponding at least one target webpage.

Each piece of web publishing information contains indication information for its corresponding target webpage. The indication information includes, but is not limited to, webpage link information pointing to the target webpage.

Preferably, the web publishing information includes, but is not limited to, any kind of information that can be presented in a webpage in various forms.

For example, when the information to be published includes the keyword "apple", the web publishing information may include a URL embedded in a webpage that contains the keyword "apple", or a picture embedded in a webpage whose subject is "apple".

Continuing with the second example above, the generating device determines, under a predetermined selection rule of taking the best-matching webpage, that the target webpage corresponding to the information to be published Info2 is page4. The generating device then generates hyperlink information Link1, which uses the content of Info2 as its anchor text and points to page4.

Next, when a query sequence related to the information to be published is received, the feedback device returns the one or more pieces of web publishing information corresponding to that information to be published.

Continuing with the second example above, when the feedback device receives a query sequence submitted by a user that contains the content of the information to be published Info2, the feedback device provides the hyperlink information Link1 generated by the generating device to the user.
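The generation and feedback steps can be illustrated with a minimal sketch. Everything named here (`make_link`, the `Feedback` class, the sample URL, and the substring test for "a query related to the information") is hypothetical; the description does not prescribe a data format or a matching rule for queries.

```python
from html import escape


def make_link(info_text, target_url):
    """Build hyperlink markup with the info content as anchor text."""
    return f'<a href="{escape(target_url, quote=True)}">{escape(info_text)}</a>'


class Feedback:
    """Stores generated publishing information and returns it for matching queries."""

    def __init__(self):
        self._links = {}

    def register(self, info_text, links):
        """Associate a piece of information to be published with its links."""
        self._links[info_text] = list(links)

    def respond(self, query):
        """Return the links of every registered info whose text occurs in the query."""
        matched = []
        for info_text, links in self._links.items():
            if info_text in query:
                matched.extend(links)
        return matched


# Placeholder content and URL standing in for Info2 and page4.
link1 = make_link("Info2 content", "http://www.site2.example/page4.html")
feedback = Feedback()
feedback.register("Info2 content", [link1])
hits = feedback.respond("where is Info2 content?")
misses = feedback.respond("unrelated query")
```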

According to the solution of the present invention, the computer device can automatically determine the webpages that match the information to be published without manual configuration, which improves operating efficiency. Moreover, when webpage content changes, the computer device can run the solution again on the changed webpages to redetermine the target webpage corresponding to the information to be published. The solution therefore determines, automatically and efficiently, target webpages that match the information to be published well.

According to the solution of this embodiment, when a user corresponds to multiple accounts, the network device can automatically analyze the importance of those accounts and trigger subsequent management operations for the user according to information about the main account, without managing less important secondary accounts. This reduces the load on the network device itself while still managing users effectively.

The software program of the present invention can be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example RAM, a magnetic or optical drive, a floppy disk, or similar devices. In addition, some steps or functions of the present invention may be implemented in hardware, for example as circuitry that cooperates with a processor to perform the respective steps or functions.

In addition, part of the present invention may be applied as a computer program product, for example computer program instructions which, when executed by a computer, invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted via a data stream in broadcast or other signal-bearing media, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment of the present invention includes an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to run the methods and/or technical solutions according to the foregoing embodiments of the present invention.

It will be apparent to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments and can be embodied in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the present invention. No reference sign in a claim should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices recited in the system claims may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not imply any particular order.

Claims (17)

1. A method, implemented by a computer device, for determining a target webpage, wherein the method comprises the following steps:
a. obtaining information to be published and a plurality of candidate webpages corresponding thereto;
b. obtaining the matching degree between the information to be published and each of the plurality of candidate webpages;
c. selecting, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as target webpages corresponding to the information to be published.

2. The method according to claim 1, wherein step b comprises the following steps:
b1. obtaining the similarity between the information to be published and the webpage content of each candidate webpage;
b2. determining the matching degree between the information to be published and each candidate webpage according to the similarity.

3. The method according to claim 2, wherein step b1 comprises the following step:
- calculating edit distance information between the information to be published and the webpage content of the candidate webpage, so as to determine similarity information according to the edit distance information.

4. The method according to any one of claims 1 to 3, wherein some or all of the plurality of candidate webpages belong to the same website, and wherein step b comprises the following steps:
- communicating with the website to obtain the result of a query that the website runs within itself based on the information to be published, the query result containing ranking information for said some or all candidate webpages;
- determining, according to the ranking information for said some or all candidate webpages contained in the query result, the matching degree between the information to be published and each of said some or all candidate webpages.

5. The method according to any one of claims 2 to 4, wherein the webpage content includes at least any one of the following:
- anchor text information;
- webpage topic information;
- webpage body text information.

6. The method according to any one of claims 1 to 5, wherein step c comprises the following steps:
- for each candidate webpage, judging whether the matching degree between the candidate webpage and the information to be published reaches a predetermined threshold;
- when the predetermined threshold is reached, taking the candidate webpage as a target webpage of the information to be published.

7. The method according to claim 6, wherein step c further comprises the following step:
- when no candidate webpage reaches the predetermined threshold, taking a predetermined webpage as the target webpage of the information to be published.

8. The method according to any one of claims 1 to 7, wherein the method further comprises the following step:
- generating one or more pieces of web publishing information according to the information to be published and its corresponding at least one target webpage, wherein each piece of web publishing information contains indication information for its corresponding target webpage;
and wherein the method further comprises the following step:
- when a query sequence related to the information to be published is received, returning the one or more pieces of web publishing information corresponding to the information to be published.

9. A webpage determining apparatus for determining a target webpage, wherein the webpage determining apparatus comprises:
a first obtaining device, configured to obtain information to be published and a plurality of candidate webpages corresponding thereto;
a second obtaining device, configured to obtain the matching degree between the information to be published and each of the plurality of candidate webpages;
a selection device, configured to select, according to the matching degree between the information to be published and each candidate webpage, one or more candidate webpages as target webpages corresponding to the information to be published.

10. The webpage determining apparatus according to claim 9, wherein the second obtaining device comprises:
a first sub-obtaining device, configured to obtain the similarity between the information to be published and the webpage content of each candidate webpage;
a first determining device, configured to determine the matching degree between the information to be published and each candidate webpage according to the similarity.

11. The webpage determining apparatus according to claim 10, wherein the first sub-obtaining device comprises:
a computing device, configured to calculate edit distance information between the information to be published and the webpage content of the candidate webpage, so as to determine similarity information according to the edit distance information.

12. The webpage determining apparatus according to any one of claims 9 to 11, wherein some or all of the plurality of candidate webpages belong to the same website, and wherein the second obtaining device comprises:
a second sub-obtaining device, configured to communicate with the website to obtain the result of a query that the website runs within itself based on the information to be published, the query result containing ranking information for said some or all candidate webpages;
a second determining device, configured to determine, according to the ranking information for said some or all candidate webpages contained in the query result, the matching degree between the information to be published and each of said some or all candidate webpages.

13. The webpage determining apparatus according to any one of claims 10 to 12, wherein the webpage content includes at least any one of the following:
- anchor text information;
- webpage topic information;
- webpage body text information.

14. The webpage determining apparatus according to any one of claims 9 to 13, wherein the selection device comprises:
a judging device, configured to judge, for each candidate webpage, whether the matching degree between the candidate webpage and the information to be published reaches a predetermined threshold;
a first sub-selection device, configured to take the candidate webpage as a target webpage of the information to be published when the predetermined threshold is reached.

15. The webpage determining apparatus according to claim 14, wherein the selection device further comprises:
a second sub-selection device, configured to take a predetermined webpage as the target webpage of the information to be published when no candidate webpage reaches the predetermined threshold.

16. The webpage determining apparatus according to any one of claims 9 to 15, wherein the webpage determining apparatus further comprises:
a generating device, configured to generate one or more pieces of web publishing information according to the information to be published and its corresponding at least one target webpage, wherein each piece of web publishing information contains indication information for its corresponding target webpage;
a feedback device, configured to return, when a query sequence related to the information to be published is received, the one or more pieces of web publishing information corresponding to the information to be published.

17. A computer device, wherein the computer device comprises the webpage determining apparatus according to at least any one of claims 9 to 16.
CN201310092363.8A 2013-03-21 2013-03-21 Method, device and equipment for determining target webpage Active CN104063394B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310092363.8A CN104063394B (en) 2013-03-21 2013-03-21 Method, device and equipment for determining target webpage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310092363.8A CN104063394B (en) 2013-03-21 2013-03-21 Method, device and equipment for determining target webpage

Publications (2)

Publication Number Publication Date
CN104063394A true CN104063394A (en) 2014-09-24
CN104063394B CN104063394B (en) 2020-05-08

Family

ID=51551110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310092363.8A Active CN104063394B (en) 2013-03-21 2013-03-21 Method, device and equipment for determining target webpage

Country Status (1)

Country Link
CN (1) CN104063394B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331449A (en) * 2014-10-29 2015-02-04 百度在线网络技术(北京)有限公司 Method and device for determining similarity between inquiry sentence and webpage, terminal and server
CN114077722A (en) * 2021-10-20 2022-02-22 深信服科技股份有限公司 Data leakage tracking method and device, electronic equipment and computer storage medium
CN114463730A (en) * 2021-07-15 2022-05-10 荣耀终端有限公司 A page identification method and terminal device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070073667A1 (en) * 2004-06-01 2007-03-29 Chung Hyun J Search system and method using a plurality of searching criterion
CN101256596A (en) * 2008-03-28 2008-09-03 北京搜狗科技发展有限公司 Method and system for instation guidance
CN102789453A (en) * 2011-05-16 2012-11-21 阿里巴巴集团控股有限公司 Advertising information release method and device
CN102968413A (en) * 2011-08-31 2013-03-13 北京百度网讯科技有限公司 Method and equipment for providing searching result

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331449A (en) * 2014-10-29 2015-02-04 百度在线网络技术(北京)有限公司 Method and device for determining similarity between inquiry sentence and webpage, terminal and server
CN104331449B (en) * 2014-10-29 2017-10-27 百度在线网络技术(北京)有限公司 Method, device, terminal and server for determining similarity between query statement and webpage
CN114463730A (en) * 2021-07-15 2022-05-10 荣耀终端有限公司 A page identification method and terminal device
CN114077722A (en) * 2021-10-20 2022-02-22 深信服科技股份有限公司 Data leakage tracking method and device, electronic equipment and computer storage medium

Also Published As

Publication number Publication date
CN104063394B (en) 2020-05-08

Similar Documents

Publication Publication Date Title
CN102625936B (en) Query suggestions from documentation
US10216851B1 (en) Selecting content using entity properties
CN106489146B (en) Query rewriting using session information
US8275771B1 (en) Non-text content item search
US9805102B1 (en) Content item selection based on presentation context
US12056197B2 (en) Identifying information using referenced text
CN112136127B (en) Action indicator for search operation output element
CN111159572B (en) Recommended content auditing method and device, electronic equipment and storage medium
US12130827B1 (en) Triggering knowledge panels
US10152521B2 (en) Resource recommendations for a displayed resource
US20130041898A1 (en) Image processing system, image processing method, program, and non-transitory information storage medium
US11055312B1 (en) Selecting content using entity properties
JP6363682B2 (en) Method for selecting an image that matches content based on the metadata of the image and content
US8819004B1 (en) Ranking image search results using hover data
CN103186574A (en) Method and device for generating searching result
CN103150663A (en) Method and device for placing network placement data
CN102999595B (en) Method and device for providing an access page corresponding to page information
CN105706081B (en) Structured Information Link Notes
CN113111216B (en) Advertisement recommendation method, device, equipment and storage medium
CN104063394B (en) Method, device and equipment for determining target webpage
JP6676699B2 (en) Information providing method and apparatus using degree of association between reserved word and attribute language
CN104077320B (en) A method and device for generating information to be released
CN102436511A (en) A method and device for obtaining guidance prompt information for network search
WO2017049767A1 (en) Method and apparatus for generating query result
JP2011060228A (en) Webpage correlation evaluation device for detecting information spreading

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant