CN104572956A - System and method for confirming POI information effectiveness - Google Patents
- Publication number
- CN104572956A (application CN201410849380.6A)
- Authority
- CN
- China
- Prior art keywords
- poi
- poi information
- name
- network
- address data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a system and method for determining the validity of POI information based on address data in a network. The method includes: using address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name; counting the number of occurrences of the POI information in the address data in the network; and determining, from that number of occurrences, the valid POI information corresponding to the same POI name. The invention uses keyword clustering to group differently worded versions of the same POI name into one class, applies an Internet "voting" mechanism to select the best POI name among the multiple POI names corresponding to one longitude and latitude, and uses the same "voting" mechanism to select credible POI names, so that users can quickly and accurately find credible POI information for a given POI address, improving the user experience.
Description
Technical Field
The invention relates to the technical field of electronic maps, and in particular to a system and a method for determining the validity of POI information based on address data in a network.
Background
A point of interest (POI) is a geographic information point marked on an electronic map. It usually includes a POI identifier, POI name, POI type, longitude, and latitude. Because a POI carries longitude and latitude information, it can be marked on the map and used to find landmarks or buildings and to compute navigation to them, for example shopping malls, parking lots, schools, hospitals, hotels, restaurants, supermarkets, parks, and tourist attractions.
More and more users query POIs on electronic maps, and the POI data stored in a database supports those queries. At present, the POI data in the database is updated mainly in two ways: by field collection, with the stored POI data updated from the collected data; or by obtaining POI data from lifestyle-information websites on the Internet, where any record that includes a POI name and address is accepted as a POI record. Given how POI data is acquired and updated, widely varying POI data inevitably exists on the Internet. POI data obtained from different source websites may therefore contain duplicates: several POI records actually describe the same POI, with the same longitude and latitude, but describe the POI name and POI address differently. Duplicate POI data prevents users from quickly and accurately finding the POI name that corresponds to a POI address at a given geographic location (longitude and latitude), which degrades the user experience.
Summary of the Invention
In view of the above problems, the invention is proposed to provide a system for determining the validity of POI information based on address data in a network, and a corresponding method, that overcome the above problems or at least partially solve or alleviate them.
According to one aspect of the invention, a system for determining the validity of POI information based on address data in a network is provided, the system comprising:
a POI information acquisition unit, configured to use a search engine and address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name;
a statistics unit, configured to count the number of occurrences of the POI information in the address data in the network;
a POI information determination unit, configured to determine valid POI information corresponding to the same POI name according to the number of occurrences of the POI information in the address data in the network.
Preferably, the multiple pieces of related POI information are information corresponding to at least one preset attribute of the POI.
Preferably, the preset attribute is the longitude and latitude, the address, the building name, or the name of an organization located at the POI.
Preferably, the statistics unit further includes:
a POI information source acquisition module, configured to obtain the source of the POI information;
a POI information source reliability judgment module, configured to judge whether the source is a reliable source;
a statistics module, configured to count the number of occurrences of the POI information in the address data in the network if the source is a reliable source, and otherwise not to count it.
Preferably, the POI information determination unit further includes:
a judgment subunit, configured to judge whether the number of occurrences of the POI information in the address data in the network is higher than a predetermined threshold;
a POI information determination subunit, configured to determine that the acquired POI information is valid if the judgment subunit judges that it is.
Preferably, the reliable source is a source with a predetermined credibility.
Preferably, the source is a website or a web page.
According to another aspect of the invention, a method for determining the validity of POI information based on address data in a network is provided, comprising:
using address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name;
counting the number of occurrences of the POI information in the address data in the network;
determining valid POI information corresponding to the same POI name according to the number of occurrences of the POI information in the address data in the network.
Preferably, the multiple pieces of related POI information are information corresponding to at least one preset attribute of the POI.
Preferably, the preset attribute is the longitude and latitude, the address, the building name, or the name of an organization located at the POI.
Preferably, the step of counting the number of occurrences of the POI information in the address data in the network further comprises:
obtaining the source of the POI information;
judging whether the source is a reliable source; if so, counting the number of occurrences of the POI information in the address data in the network, and otherwise not counting it.
Preferably, the step of determining valid POI information corresponding to the same POI name according to the number of occurrences of the POI information in the address data in the network further comprises:
judging whether the number of occurrences of the POI information in the address data in the network is higher than a predetermined threshold;
if so, determining that the POI information is valid.
Preferably, the reliable source is a source with a predetermined credibility.
Preferably, the source is a website or a web page.
The beneficial effects of the invention are as follows:
The invention uses address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name, and determines the valid POI information corresponding to that POI name according to the number of occurrences of the POI information in the address data in the network. Users can thus quickly and accurately find the one or more POI names that correspond to a POI address at a given longitude and latitude. A network voting mechanism then filters those POI names by information source and by how often they appear on the Internet, and selects a highly credible POI name as the POI name for the current POI address, improving the validity of the POI information.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the description, and so that the above and other objects, features, and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered as limiting the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
FIG. 1 schematically shows a block diagram of a system for determining the validity of POI information based on address data in a network according to one embodiment of the invention;
FIG. 2 schematically shows a block diagram of the statistics unit in a system for determining the validity of POI information based on address data in a network according to another embodiment of the invention;
FIG. 3 schematically shows a block diagram of the POI information determination unit in a system for determining the validity of POI information based on address data in a network according to another embodiment of the invention;
FIG. 4 schematically shows a flowchart of a method for determining the validity of POI information based on address data in a network according to one embodiment of the invention;
FIG. 5 schematically shows a detailed flowchart of step S12 of the method for determining the validity of POI information based on address data in a network according to another embodiment of the invention; and
FIG. 6 schematically shows a detailed flowchart of step S13 of the method for determining the validity of POI information based on address data in a network according to another embodiment of the invention.
Detailed Description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural. It should further be understood that the word "comprising" used in this description indicates the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs. It should also be understood that terms such as those defined in general-purpose dictionaries are to be read with a meaning consistent with their meaning in the context of the prior art and, unless specifically defined, are not to be interpreted in an idealized or overly formal sense.
FIG. 1 schematically shows a block diagram of a system for determining the validity of POI information based on address data in a network according to one embodiment of the invention.
Referring to FIG. 1, the system of this embodiment for determining the validity of POI information based on address data in a network includes:
a POI information acquisition unit 11, configured to use a search engine and address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name;
In this embodiment, the multiple pieces of related POI information are information corresponding to at least one preset attribute of the POI. Further, the preset attribute is the longitude and latitude, the address, the building name, or the name of an organization located at the POI.
a statistics unit 12, configured to count the number of occurrences of the POI information in the address data in the network;
a POI information determination unit 13, configured to determine valid POI information corresponding to the same POI name according to the number of occurrences of the POI information in the address data in the network.
In this embodiment, address data is crawled from network data by means of a search engine, and each address record includes a name field and address information. An example of map address data mined from the Internet this way is name: 恒大地产集团昆明公司 (Evergrande Real Estate Group Kunming Company); address: 昆明市盘龙区北辰财富中心A座写字楼14楼 (14th Floor, Office Building A, Beichen Fortune Center, Panlong District, Kunming). Here the name is the POI name and the address is the POI address. Geocoding the address yields its longitude and latitude; for the address above, the result is 102.733445° E, 25.08108° N. In addition, the number of times the POI information appears on the Internet is counted and its source is recorded.
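As a minimal sketch of the record described above, the fields could be held in a small data class. The class name `AddressRecord`, its field names, and the source value `example.com` are assumptions made for illustration; the patent does not specify a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRecord:
    """One mined address record (hypothetical layout, not from the patent)."""
    name: str       # POI name field
    address: str    # POI address as crawled
    lon: float      # longitude obtained by geocoding the address
    lat: float      # latitude obtained by geocoding the address
    source: str     # website or web page where the record was found
    count: int = 1  # number of times the record was seen on the Internet


# The example from the text; the source value is a placeholder.
record = AddressRecord(
    name="恒大地产集团昆明公司",
    address="昆明市盘龙区北辰财富中心A座写字楼14楼",
    lon=102.733445,
    lat=25.08108,
    source="example.com",
)
```

Keeping the source and the count on each record is what later allows the counting step to skip unreliable sources.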
The POI information from different sources that corresponds to the address data finally mined from the Internet therefore has the format shown in Table 1:
Table 1. Format of POI information from different information sources
As Table 1 shows, POI data obtained from different source websites for the same geographic location (same longitude and latitude) may contain duplicates. One address (longitude and latitude) may have several POI names: in Table 1, several companies share one longitude and latitude, so their actual POI coordinates are identical while the POI names and POI addresses are described differently. The same POI name may also be written in several ways, for example 保山明志汽车销售有限公司 (Baoshan Mingzhi Automobile Sales Co., Ltd.) and 保山明志汽车销售服务有限公司 (Baoshan Mingzhi Automobile Sales and Service Co., Ltd.). Such duplicate POI data prevents users from quickly and accurately finding the POI name that corresponds to a POI address at a given geographic location (longitude and latitude).
In this embodiment, a search engine and address data in the network are used to obtain multiple pieces of related POI information corresponding to the same POI name, where the related POI information corresponds to at least one preset attribute of the POI (the longitude and latitude, the address, the building name, or the name of an organization located at the POI). Valid POI information corresponding to the same POI name is then determined according to the number of occurrences of the POI information in the address data in the network.
Further, the step of determining valid POI information corresponding to the same POI name according to the number of occurrences includes: using the preset-attribute information of the related POI information, clustering by keyword the name fields that correspond to the same address information; counting how often each name field occurs within each class after clustering, as the second frequency; determining from the second frequency the POI name of that class for the address information; and using the Internet "voting" mechanism to select credible, valid POI information for the same POI name.
Further, one or more keywords are determined from the name fields, the keywords corresponding to the same address information are clustered, and the clustered name fields are determined from the clustered keywords.
Further, the name in each name field is segmented into words, and the keyword of the name field is obtained from those words.
Further, the frequency of each segmented word corresponding to the same address information is counted as the first frequency, and the keyword of the name field is generated from the first frequency. Specifically, the word with the lowest first frequency that is not a place name is selected as the keyword of the name field.
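The keyword rule just described (lowest first frequency, excluding place names) can be sketched as follows. The function name `pick_keyword`, the pre-segmented token list, the frequency values, and the place-name set are all assumptions made for illustration; word segmentation itself is taken as already done by some external tool.

```python
def pick_keyword(tokens, token_freq, place_names):
    """Return the least frequent token that is not a place name, or None."""
    candidates = [t for t in tokens if t not in place_names]
    if not candidates:
        return None
    # Tokens missing from the frequency table are treated as frequency 0.
    return min(candidates, key=lambda t: token_freq.get(t, 0))


# Hypothetical segmentation of 保山明志汽车销售有限公司 and made-up frequencies.
tokens = ["保山", "明志", "汽车", "销售", "有限公司"]
token_freq = {"保山": 12000, "明志": 35, "汽车": 900000,
              "销售": 1500000, "有限公司": 8000000}
place_names = {"保山"}

keyword = pick_keyword(tokens, token_freq, place_names)  # "明志"
```

Rare words discriminate best between names, which is why the rule prefers the lowest frequency; the place-name filter keeps a rare town name from being chosen over the business name.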
Further, the invention may take the name field with the highest second frequency in each class as that class's identifying name, and use every class's identifying name as a POI name for the address information; or it may take the name field with the highest second frequency in each class as that class's identifying name, and use the identifying name that appears most often on the network as the POI name for the address information.
The names in the POI information of the mined address data are segmented into words, and the number of occurrences of each word is counted. Within one POI name, the word that occurs least often, and therefore carries the most information, and that is not a place name is taken as the keyword of that POI name. For example, Table 2 shows the segmented POI names from the POI information for the address data in Table 1 (word frequencies were computed over about 90 million POI names); the second column of Table 2 gives the extracted keywords:
Table 2. POI names after word segmentation
Clustering by keyword: POI names that share a keyword are placed in the same class. The POI names above fall into five classes, meaning that five different POI names exist at this POI address:
A: 保山博鑫源汽车贸易有限公司 (Baoshan Boxinyuan Automobile Trading Co., Ltd.);
B: 云南省澜沧江啤酒集团保山有限公司 (Yunnan Lancangjiang Beer Group Baoshan Co., Ltd.) and 云南省澜沧江啤酒集团保山有限公司(地图标注) (the same name with the suffix "map annotation");
C: 保山明志汽车销售有限公司 (Baoshan Mingzhi Automobile Sales Co., Ltd.) and 保山明志汽车销售服务有限公司 (Baoshan Mingzhi Automobile Sales and Service Co., Ltd.);
D: 保山长城汽车4S店 (Baoshan Great Wall Motor 4S store);
E: 保山融易通汽车销售有限公司(雪佛兰4S店) (Baoshan Rongyitong Automobile Sales Co., Ltd. (Chevrolet 4S store)).
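The grouping into classes A to E can be sketched as a dictionary keyed by keyword. The keywords below are assumptions, since Table 2 is not reproduced in this text, and `cluster_by_keyword` is a hypothetical helper name.

```python
from collections import defaultdict


def cluster_by_keyword(name_to_keyword):
    """Group POI names that share the same keyword into one class."""
    clusters = defaultdict(list)
    for name, keyword in name_to_keyword.items():
        clusters[keyword].append(name)
    return dict(clusters)


# Names from the text; the keyword assigned to each is assumed for illustration.
name_to_keyword = {
    "保山博鑫源汽车贸易有限公司": "博鑫源",
    "云南省澜沧江啤酒集团保山有限公司": "澜沧江",
    "云南省澜沧江啤酒集团保山有限公司(地图标注)": "澜沧江",
    "保山明志汽车销售有限公司": "明志",
    "保山明志汽车销售服务有限公司": "明志",
    "保山长城汽车4S店": "长城",
    "保山融易通汽车销售有限公司(雪佛兰4S店)": "融易通",
}

clusters = cluster_by_keyword(name_to_keyword)  # five classes
```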
To show the advantages of the invention more fully, the internal structure of the statistics unit 12 in another embodiment of the system is described below. Referring to FIG. 2, the statistics unit 12 further includes a POI information source acquisition module 121, a POI information source reliability judgment module 122, and a statistics module 123:
the POI information source acquisition module 121 is configured to obtain the source of the POI information;
the POI information source reliability judgment module 122 is configured to judge whether the source is a reliable source;
the statistics module 123 is configured to count the number of occurrences of the POI information in the address data in the network if the source is a reliable source, and otherwise not to count it.
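The behaviour of the three modules can be sketched as a single counting function that skips mentions from sources outside a reliable set. The function name, the `(name, source)` tuple layout, and the example domains are assumptions made for illustration.

```python
from collections import Counter


def count_from_reliable_sources(mentions, reliable_sources):
    """Count occurrences of each POI name, ignoring unreliable sources."""
    counts = Counter()
    for name, source in mentions:
        if source in reliable_sources:  # reliability judgment (module 122)
            counts[name] += 1           # counting (module 123)
    return counts


# Hypothetical mentions as (POI name, source) pairs.
mentions = [
    ("保山长城汽车4S店", "trusted.example"),
    ("保山长城汽车4S店", "trusted.example"),
    ("保山长城汽车4S店", "spam.example"),
]
counts = count_from_reliable_sources(mentions, {"trusted.example"})
```

In this sketch the mention from `spam.example` is dropped, so the name is counted twice rather than three times.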
In this embodiment, the best POI name within a class is selected by "voting" on the Internet. The "vote" is based mainly on how often the POI name appears on the Internet and on the credibility of its sources: the name that appears most often and comes from the most credible sources is selected as the best name. For example:
Class A contains only one name, so that name is the best one.
Class B contains two names; 云南省澜沧江啤酒集团保山有限公司 occurs most frequently and is selected as the best name.
Class C contains two names; 保山明志汽车销售服务有限公司 occurs most frequently and is selected as the best name.
Classes D and E each contain only one name, as in class A.
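One way to sketch the vote is to let each mention contribute the credibility of its source, so that frequency and credibility are combined in a single score. The patent does not say how the two factors are combined; this weighting, the function name `best_name`, the credibility values, and the mention counts are all assumptions.

```python
from collections import defaultdict


def best_name(mentions, credibility):
    """Pick the name with the highest credibility-weighted vote, or None."""
    scores = defaultdict(float)
    for name, source in mentions:
        # Each mention is one vote, weighted by its source's credibility;
        # unknown sources contribute nothing.
        scores[name] += credibility.get(source, 0.0)
    if not scores:
        return None
    return max(scores, key=scores.get)


# Class C from the text, with made-up mention counts and credibility values.
mentions = [
    ("保山明志汽车销售服务有限公司", "a.example"),
    ("保山明志汽车销售服务有限公司", "a.example"),
    ("保山明志汽车销售有限公司", "b.example"),
]
credibility = {"a.example": 0.9, "b.example": 0.6}

winner = best_name(mentions, credibility)
```

With these illustrative numbers the longer name scores 1.8 against 0.6, which matches the outcome the text gives for class C.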
To show the advantages of the invention more fully, the internal structure of the POI information determination unit 13 in another embodiment of the system is described below. Referring to FIG. 3, the POI information determination unit 13 further includes a judgment subunit 131 and a POI information determination subunit 132:
the judgment subunit 131 is configured to judge whether the number of occurrences of the POI information in the address data in the network is higher than a predetermined threshold;
the POI information determination subunit 132 is configured to determine that the acquired POI information is valid if the judgment subunit judges that it is.
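The threshold test performed by the two subunits can be sketched in a few lines. The function name, the counts, and the threshold value are assumptions; the strict comparison follows the wording "higher than a predetermined threshold".

```python
def valid_names(counts, threshold):
    """Return the POI names whose occurrence count exceeds the threshold."""
    return [name for name, n in counts.items() if n > threshold]


# Hypothetical occurrence counts for two names at one address.
counts = {"保山长城汽车4S店": 12, "保山博鑫源汽车贸易有限公司": 3}
accepted = valid_names(counts, threshold=5)
```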
In this embodiment, the more often POI information appears on the Internet and the more credible its sources, the more credible the POI information is. The best POI names finally selected are filtered by how often they appear on the Internet and by their sources; those above a certain threshold are the credible POI information that is finally mined.
In this embodiment, the reliable source is a source with a predetermined credibility, and the source is a website or a web page.
In this embodiment, websites or web pages that qualify as sources with predetermined credibility include, but are not limited to: large websites such as Sina and Ifeng (凤凰网); officially certified websites; websites with high visit frequency and heavy data traffic; and websites that carry no malicious or virus links and have high customer satisfaction.
In this embodiment, credibility is quantifiable: the credibility of each website or web page can be quantified from user visit counts, customer reviews, and the like. The credibility of each website or web page also changes dynamically. If a website develops viruses or fraudulent advertisements, or is exploited by other malicious or fraudulent websites, its credibility drops accordingly. By quantifying and dynamically adjusting website credibility, the invention further ensures that the acquired POI information is reliable and valid.
This embodiment uses address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name, and determines the valid POI information corresponding to that POI name according to the number of occurrences of the POI information in the address data in the network. Users can thus quickly and accurately find the one or more POI names that correspond to a POI address at a given longitude and latitude. A network voting mechanism then filters those POI names by information source and by how often they appear on the Internet, and selects a highly credible POI name as the POI name for the current POI address, improving the validity of the POI information.
FIG. 4 schematically shows a flowchart of a method for determining the validity of POI information based on address data in a network according to one embodiment of the invention.
Referring to FIG. 4, the method of this embodiment for determining the validity of POI information based on address data in a network includes the following steps:
S11. Use address data in the network to obtain multiple pieces of related POI information corresponding to the same POI name;
S12. Count the number of occurrences of the POI information in the address data in the network;
S13. Determine valid POI information corresponding to the same POI name according to the number of occurrences of the POI information in the address data in the network.
In this embodiment, the multiple related pieces of POI information are information corresponding to at least one preset attribute of the POI. The preset attribute is the longitude and latitude, the address, the building name, or the names of the organizations located at the POI.
In this embodiment, address data is crawled from network data by means of a search engine. Each address record contains a name field and address information. An example of map address data mined from the Internet in this way is name: 恒大地产集团昆明公司 (Evergrande Real Estate Group Kunming Company); address: 昆明市盘龙区北辰财富中心A座写字楼14楼 (14th Floor, Office Tower A, Beichen Fortune Center, Panlong District, Kunming). Here the name field is the POI name and the address field is the POI address. Geocoding the address yields its longitude and latitude; for this address the result is longitude 102.733445° E, latitude 25.08108° N. In addition, the number of times the POI information appears on the Internet is counted, and its sources are recorded.
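As a minimal sketch of the record described above, the structure below holds the name, address, geocoded coordinates, and source, and tallies occurrences and sources per POI name. The class `AddressRecord`, the function `tally`, and the `*.example` source names are hypothetical; the patent does not specify a data layout.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressRecord:
    name: str      # POI name field
    address: str   # POI address field
    lon: float     # longitude obtained by geocoding the address
    lat: float     # latitude obtained by geocoding the address
    source: str    # site the record was crawled from (hypothetical field)

def tally(records):
    """Count occurrences and collect the sources for each POI name."""
    counts = defaultdict(int)
    sources = defaultdict(set)
    for r in records:
        counts[r.name] += 1
        sources[r.name].add(r.source)
    return dict(counts), {name: sorted(s) for name, s in sources.items()}

records = [
    AddressRecord("恒大地产集团昆明公司", "昆明市盘龙区北辰财富中心A座写字楼14楼",
                  102.733445, 25.08108, "site-a.example"),
    AddressRecord("恒大地产集团昆明公司", "昆明市盘龙区北辰财富中心A座写字楼14楼",
                  102.733445, 25.08108, "site-b.example"),
]
counts, sources = tally(records)
```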
However, POI data obtained from different source websites for the same geographic location (identical longitude and latitude) may contain duplicates: one address (longitude and latitude) may carry several POI names. For example, several companies may share one longitude and latitude, so their actual POI coordinates are identical while their POI names and POI addresses are written differently. A single POI name may also be written in several ways, such as "保山明志汽车销售有限公司" (Baoshan Mingzhi Automobile Sales Co., Ltd.) and "保山明志汽车销售服务有限公司" (Baoshan Mingzhi Automobile Sales and Service Co., Ltd.). Such duplicate POI data prevents users from quickly and accurately finding the POI name that corresponds to a POI address at a given geographic location (longitude and latitude).
To address this, this embodiment segments the POI names in the mined address data into words and counts how many times each word occurs after segmentation. Within a given POI name, the word that occurs least frequently (and therefore carries the most information) and is not a place name is recorded as the keyword of that POI name.
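The keyword rule above can be sketched as follows. The sketch assumes the names have already been segmented into words (the patent does not name a segmenter) and that a set of place names is available; the function `pick_keywords`, the sample corpus other than the two Baoshan Mingzhi names, and the place-name set are hypothetical.

```python
from collections import Counter

def pick_keywords(segmented_names, place_names):
    """For each POI name, pick the rarest word that is not a place name."""
    # Word frequency is counted over all segmented names in the corpus.
    freq = Counter(tok for toks in segmented_names.values() for tok in toks)
    keywords = {}
    for name, toks in segmented_names.items():
        candidates = [t for t in toks if t not in place_names]
        if candidates:
            # On ties, min() keeps the earliest candidate in the name.
            keywords[name] = min(candidates, key=lambda t: freq[t])
    return keywords

# Pre-segmented names; only the first two come from the patent text.
segmented = {
    "保山明志汽车销售有限公司": ["保山", "明志", "汽车", "销售", "有限公司"],
    "保山明志汽车销售服务有限公司": ["保山", "明志", "汽车", "销售", "服务", "有限公司"],
    "昆明宏达汽车销售有限公司": ["昆明", "宏达", "汽车", "销售", "有限公司"],
    "昆明宏达家政服务有限公司": ["昆明", "宏达", "家政", "服务", "有限公司"],
    "大理顺发家政服务有限公司": ["大理", "顺发", "家政", "服务", "有限公司"],
}
keywords = pick_keywords(segmented, place_names={"保山", "昆明", "大理"})
```

In this sample corpus both spellings of the Baoshan company receive the keyword "明志", which is what allows them to be grouped together later.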
To further illustrate the advantages of the invention, the sub-steps of step S12 of the method are disclosed below as another embodiment. Referring to Fig. 5, step S12 comprises the following sub-steps:
S121. Obtain the source of the POI information;
S122. Determine whether the source is a reliable source; if so, execute step S123;
S123. When the source is a reliable source, count the number of occurrences of the POI information in the address data in the network; otherwise, do not count it.
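Sub-steps S121 to S123 can be sketched as a filtered count. The function name `count_from_reliable_sources`, the `(poi_name, source)` pair layout, and the example source names are assumptions made for illustration.

```python
from collections import Counter

def count_from_reliable_sources(observations, reliable_sources):
    """Count POI occurrences, skipping those from unreliable sources."""
    counts = Counter()
    for poi_name, source in observations:   # S121: obtain the source
        if source in reliable_sources:      # S122: is the source reliable?
            counts[poi_name] += 1           # S123: count the occurrence
        # Occurrences from other sources are not counted.
    return counts

observations = [
    ("POI A", "portal.example"),
    ("POI A", "unknown.example"),
    ("POI B", "portal.example"),
    ("POI A", "certified.example"),
]
reliable = {"portal.example", "certified.example"}
occurrences = count_from_reliable_sources(observations, reliable)
```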
In this embodiment, the best POI name within a group of equivalent POI names is chosen by a "vote" on the Internet. The vote is based mainly on how often the POI name appears on the Internet and on how credible its sources are: the name that appears most often and comes from the most credible sources is selected as the best name.
To further illustrate the advantages of the invention, the sub-steps of step S13 of the method are disclosed below as another embodiment. Referring to Fig. 6, step S13 comprises the following sub-steps:
S131. Determine whether the number of occurrences of the POI information in the address data in the network is higher than a predetermined threshold; if so, execute step S132;
S132. Determine that the POI information is valid.
In this embodiment, the more often POI information appears on the Internet and the more credible its sources, the more credible the POI information is. The best POI name finally selected is filtered by its frequency of occurrence on the Internet and by its sources; names that exceed a given threshold are taken as the credible POI information that is finally mined.
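The threshold check of S131 and S132 can be sketched as below. The function name `valid_poi_names` and the sample counts are hypothetical; "higher than" is read as a strict comparison, and the patent does not state a threshold value.

```python
def valid_poi_names(occurrence_counts, threshold):
    """Return the POI names whose occurrence count exceeds the threshold."""
    # S131: compare each count with the threshold; S132: keep those above it.
    return [name for name, n in occurrence_counts.items() if n > threshold]

sample_counts = {"POI A": 5, "POI B": 2, "POI C": 3}
valid = valid_poi_names(sample_counts, threshold=2)
```

A name whose count equals the threshold is not treated as valid under this reading.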
In this embodiment, a reliable source is a source with predetermined credibility, where the source is a website or a web page.
In this embodiment of the invention, websites or web pages that qualify as sources of predetermined credibility include, but are not limited to: large websites such as Sina and ifeng (凤凰网); officially certified websites; websites with high visit frequency and heavy data traffic; and websites that carry no malicious or virus-laden links and have high customer satisfaction.
In this embodiment, credibility is quantifiable: the credibility of each website or web page can be quantified from, for example, the number of user visits and customer ratings. The credibility of each website or web page also changes dynamically. If a website is found to carry viruses or fraudulent advertisements, or to be exploited by other malicious or fraudulent websites, its credibility is lowered accordingly. By quantifying and dynamically adjusting website credibility, the invention further ensures that the acquired POI information is reliable and valid.
With the method provided by this embodiment for determining the validity of POI information based on address data in a network, the keyword of each POI name is mined from word frequencies after segmentation, and names are clustered by that keyword so that different spellings of the same POI name fall into one group. This resolves the problem of one longitude and latitude corresponding to several POI names. The Internet "voting" mechanism is then used both to select the best POI name and to select credible POI information.
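Clustering by keyword and then voting can be sketched as follows. Weighting each occurrence by the credibility of its source is an assumption; the patent only says that the vote considers frequency and source credibility. The function `best_name_per_cluster`, the source labels, and the credibility values are hypothetical.

```python
from collections import defaultdict

def best_name_per_cluster(observations, keyword_of, credibility):
    """Group names by keyword, then pick the name with the highest weighted vote."""
    votes = defaultdict(lambda: defaultdict(float))
    for name, source in observations:
        # Each occurrence is one vote, weighted by source credibility
        # (unknown sources contribute nothing).
        votes[keyword_of[name]][name] += credibility.get(source, 0.0)
    return {kw: max(tally, key=tally.get) for kw, tally in votes.items()}

keyword_of = {
    "保山明志汽车销售有限公司": "明志",
    "保山明志汽车销售服务有限公司": "明志",
}
observations = [
    ("保山明志汽车销售有限公司", "s1"),
    ("保山明志汽车销售服务有限公司", "s2"),
    ("保山明志汽车销售有限公司", "s3"),
]
source_credibility = {"s1": 0.9, "s2": 0.8, "s3": 0.4}
best = best_name_per_cluster(observations, keyword_of, source_credibility)
```

Here the shorter spelling wins the "明志" cluster because it appears twice, even though one of its sources has low credibility.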
In summary, the invention uses address data in the network to obtain multiple related pieces of POI information corresponding to the same POI name, and determines the valid POI information for that POI name according to how often the POI information occurs in the address data. Users can thus quickly and accurately retrieve the one or more POI names that correspond to a POI address at a given longitude and latitude. A network voting mechanism then filters those POI names by information source and by frequency of occurrence on the Internet, and a POI name with high credibility is selected as the POI name for the current POI address, which improves the validity of the POI information.
It should be noted that the algorithms and formulas provided here are not inherently tied to any particular computer, virtual system, or other device. Various general-purpose systems can also be used with the teachings herein. From the description above, the structure required to build such systems is apparent. Moreover, the invention is not directed to any particular programming language. It should be understood that the invention described here can be implemented in a variety of programming languages, and that any description of a specific language above is given to disclose the best mode of the invention.
The description provided here sets forth numerous specific details. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, to streamline the disclosure and aid understanding of one or more of the inventive aspects, features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof in the foregoing description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules of the device in an embodiment can be adaptively changed and placed in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into a single module, unit, or component, and may also be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described here include certain features that are included in other embodiments and not others, combinations of features from different embodiments are meant to fall within the scope of the invention and to form different embodiments.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that, in practice, a microprocessor or digital signal processor (DSP) may be used to implement some or all of the functions of some or all of the components of the website security detection device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program or computer program product) for performing part or all of the methods described here. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
The above describes only some embodiments of the invention. It should be noted that those of ordinary skill in the art may make further improvements and refinements without departing from the principles of the invention, and such improvements and refinements are also regarded as falling within the scope of protection of the invention.
Claims (10)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410849380.6A CN104572956B (en) | 2014-12-29 | 2014-12-29 | Determine the system and method for POI effectiveness |
| PCT/CN2015/095857 WO2016107352A1 (en) | 2014-12-29 | 2015-11-27 | System and method for determining poi name and for determining validity of poi information |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410849380.6A CN104572956B (en) | 2014-12-29 | 2014-12-29 | Determine the system and method for POI effectiveness |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN104572956A | 2015-04-29 |
| CN104572956B | 2016-10-12 |
Family
ID=53089018
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410849380.6A Active CN104572956B (en) | 2014-12-29 | 2014-12-29 | Determine the system and method for POI effectiveness |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN104572956B (en) |
Cited By (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105160032A (en) * | 2015-09-30 | 2015-12-16 | 北京奇虎科技有限公司 | Method and device for determining confidence of point of interest data in website |
| CN105159885A (en) * | 2015-09-30 | 2015-12-16 | 北京奇虎科技有限公司 | Point-of-interest name identification method and device |
| CN105224660A (en) * | 2015-09-30 | 2016-01-06 | 北京奇虎科技有限公司 | A kind of disposal route of map point of interest POI data and device |
| CN105279249A (en) * | 2015-09-30 | 2016-01-27 | 北京奇虎科技有限公司 | A method and device for determining the confidence level of point-of-interest data in a website |
| CN105608153A (en) * | 2015-12-18 | 2016-05-25 | 晶赞广告(上海)有限公司 | Universal POI information association method |
| WO2016107352A1 (en) * | 2014-12-29 | 2016-07-07 | 北京奇虎科技有限公司 | System and method for determining poi name and for determining validity of poi information |
| CN106528597A (en) * | 2016-09-23 | 2017-03-22 | 百度在线网络技术(北京)有限公司 | POI (Point Of Interest) labeling method and device |
| CN107369081A (en) * | 2017-07-19 | 2017-11-21 | 无锡企业征信有限公司 | The system and method for data validity is determined with the dynamic effects factor of data source |
| CN107729368A (en) * | 2017-09-08 | 2018-02-23 | 百度在线网络技术(北京)有限公司 | A kind of method and apparatus for POI data verification |
| CN110020216A (en) * | 2017-07-20 | 2019-07-16 | 北京嘀嘀无限科技发展有限公司 | Destination method for pushing and device |
| CN110851696A (en) * | 2018-08-01 | 2020-02-28 | 北京京东尚科信息技术有限公司 | Interest point extraction method and device |
| CN111832483A (en) * | 2020-07-14 | 2020-10-27 | 北京百度网讯科技有限公司 | A method, device, device, and storage medium for identifying the validity of a point of interest |
| CN111854778A (en) * | 2019-09-09 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Method and system for evaluating rationality of geographic position description |
| CN112261570A (en) * | 2020-09-30 | 2021-01-22 | 汉海信息技术(上海)有限公司 | Method, device, server and storage medium for associating interest point with wireless network |
| CN112925774A (en) * | 2021-02-01 | 2021-06-08 | 大箴(杭州)科技有限公司 | Method and device for cleaning address data, storage medium and computer equipment |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6947920B2 (en) * | 2001-06-20 | 2005-09-20 | Oracle International Corporation | Method and system for response time optimization of data query rankings and retrieval |
| CN102479229A (en) * | 2010-11-29 | 2012-05-30 | 北京四维图新科技股份有限公司 | Point of interest data generation method and system |
| CN104077295A (en) * | 2013-03-27 | 2014-10-01 | 百度在线网络技术(北京)有限公司 | Data label mining method and data label mining system |
| CN104216895A (en) * | 2013-05-31 | 2014-12-17 | 高德软件有限公司 | Method and device for generating POI data |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6947920B2 (en) * | 2001-06-20 | 2005-09-20 | Oracle International Corporation | Method and system for response time optimization of data query rankings and retrieval |
| CN102479229A (en) * | 2010-11-29 | 2012-05-30 | 北京四维图新科技股份有限公司 | Point of interest data generation method and system |
| CN104077295A (en) * | 2013-03-27 | 2014-10-01 | 百度在线网络技术(北京)有限公司 | Data label mining method and data label mining system |
| CN104216895A (en) * | 2013-05-31 | 2014-12-17 | 高德软件有限公司 | Method and device for generating POI data |
Cited By (20)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2016107352A1 (en) * | 2014-12-29 | 2016-07-07 | 北京奇虎科技有限公司 | System and method for determining poi name and for determining validity of poi information |
| CN105279249B (en) * | 2015-09-30 | 2019-06-21 | 北京奇虎科技有限公司 | A method and device for determining the confidence of point of interest data in a website |
| CN105224660A (en) * | 2015-09-30 | 2016-01-06 | 北京奇虎科技有限公司 | A kind of disposal route of map point of interest POI data and device |
| CN105279249A (en) * | 2015-09-30 | 2016-01-27 | 北京奇虎科技有限公司 | A method and device for determining the confidence level of point-of-interest data in a website |
| CN105160032A (en) * | 2015-09-30 | 2015-12-16 | 北京奇虎科技有限公司 | Method and device for determining confidence of point of interest data in website |
| CN105159885A (en) * | 2015-09-30 | 2015-12-16 | 北京奇虎科技有限公司 | Point-of-interest name identification method and device |
| CN105160032B (en) * | 2015-09-30 | 2019-05-31 | 北京奇虎科技有限公司 | The determination method and device of the confidence level of interest point data in a kind of website |
| CN105608153A (en) * | 2015-12-18 | 2016-05-25 | 晶赞广告(上海)有限公司 | Universal POI information association method |
| CN106528597B (en) * | 2016-09-23 | 2019-07-05 | 百度在线网络技术(北京)有限公司 | The mask method and device of point of interest |
| CN106528597A (en) * | 2016-09-23 | 2017-03-22 | 百度在线网络技术(北京)有限公司 | POI (Point Of Interest) labeling method and device |
| CN107369081A (en) * | 2017-07-19 | 2017-11-21 | 无锡企业征信有限公司 | The system and method for data validity is determined with the dynamic effects factor of data source |
| CN110020216A (en) * | 2017-07-20 | 2019-07-16 | 北京嘀嘀无限科技发展有限公司 | Destination method for pushing and device |
| CN107729368A (en) * | 2017-09-08 | 2018-02-23 | 百度在线网络技术(北京)有限公司 | A kind of method and apparatus for POI data verification |
| CN110851696A (en) * | 2018-08-01 | 2020-02-28 | 北京京东尚科信息技术有限公司 | Interest point extraction method and device |
| CN111854778A (en) * | 2019-09-09 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Method and system for evaluating rationality of geographic position description |
| CN111854778B (en) * | 2019-09-09 | 2022-05-17 | 北京嘀嘀无限科技发展有限公司 | Method and system for evaluating rationality of geographic position description |
| CN111832483A (en) * | 2020-07-14 | 2020-10-27 | 北京百度网讯科技有限公司 | A method, device, device, and storage medium for identifying the validity of a point of interest |
| CN111832483B (en) * | 2020-07-14 | 2024-03-08 | 北京百度网讯科技有限公司 | A method, device, equipment and storage medium for identifying the validity of points of interest |
| CN112261570A (en) * | 2020-09-30 | 2021-01-22 | 汉海信息技术(上海)有限公司 | Method, device, server and storage medium for associating interest point with wireless network |
| CN112925774A (en) * | 2021-02-01 | 2021-06-08 | 大箴(杭州)科技有限公司 | Method and device for cleaning address data, storage medium and computer equipment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN104572956B (en) | 2016-10-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104572956B (en) | Determine the system and method for POI effectiveness | |
| CN104572955B (en) | A kind of system and method determining POI title based on cluster | |
| CN104572957B (en) | A kind of POI title based on cluster determines system and method | |
| JP6091736B2 (en) | Method and system for evaluating the quality of location content | |
| CN102163214B (en) | A device and method for generating a digital map | |
| WO2016155386A1 (en) | Method and device for determining whether webpage comprises point of interest (poi) data | |
| CN103823900B (en) | Information point importance determines method and apparatus | |
| WO2016107352A1 (en) | System and method for determining poi name and for determining validity of poi information | |
| WO2017008653A1 (en) | Poi service provision method, poi data processing method and device | |
| CN105183908A (en) | Point of interest (POI) data classifying method and device | |
| JP2015014859A (en) | Poi information provision system, poi information provision device, poi information output device, poi information provision method, and program | |
| CN107368480A (en) | A kind of interest point data type of error positioning, repeat recognition methods and device | |
| US20130031458A1 (en) | Hyperlocal content determination | |
| CN105069079B (en) | Method and device for screening POI (Point of interest) data | |
| CN107563789A (en) | Data processing method, system, terminal and computer-readable recording medium | |
| CN104572954B (en) | A system and method for verifying map point-of-interest information by mail delivery | |
| CN110647606A (en) | Map icon display method and device | |
| CN104063437A (en) | Service information issuing and searching device and method based on electronic map | |
| CN113495997A (en) | Method and device for searching alias of POI (Point of interest) and vehicle | |
| CN105279249A (en) | A method and device for determining the confidence level of point-of-interest data in a website | |
| CN104899339A (en) | Method and system for classifying POI (Point of Interest) information | |
| CN105320752B (en) | A kind of method for digging and device of interest point data | |
| CN104537041B (en) | A kind of definite user's query word whether the method and system of invocation map interface | |
| KR101623739B1 (en) | Method for generating a point of interest database and system for performing the method | |
| JP2014203271A (en) | Target store visit facility information providing device, method, and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20160912. Address after: 201, room 518000, building A, No. 1, front Bay Road, former seaport cooperation area, Guangdong, Shenzhen (Shenzhen Qianhai business secretary Co., Ltd.). Applicant after: SHENZHEN QIHU INTELLIGENT TECHNOLOGY CO.,LTD. Address before: 100088 Beijing city Xicheng District xinjiekouwai Street 28, block D room 112 (Desheng Park). Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd. Applicant before: Qizhi software (Beijing) Co.,Ltd. |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | CP03 | Change of name, title or address | Address after: 518000, 3rd Floor, Building A2, Nanshan Zhiyuan, No. 1001 Xueyuan Avenue, Changyuan Community, Taoyuan Street, Nanshan District, Shenzhen, Guangdong Province. Patentee after: Shenzhen 3600 Smart Life Technology Co.,Ltd. Country or region after: China. Address before: Room 201, Building A, No. 1 Qianwan Road, Qianhai Harbour Cooperation Zone, Shenzhen, Guangdong 518000. Patentee before: SHENZHEN QIHU INTELLIGENT TECHNOLOGY CO.,LTD. Country or region before: China |
| | CP03 | Change of name, title or address | |
| CP03 | Change of name, title or address |