
WO2008133368A1 - Information search ranking system and method based on users' attention levels - Google Patents

Info

Publication number
WO2008133368A1
Authority
WO
WIPO (PCT)
Prior art keywords
document
attention
action
user
rank
Prior art date
Application number
PCT/KR2007/002701
Other languages
English (en)
Inventor
Soo Jung Park
Original Assignee
Onnet Mns Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Onnet Mns Co., Ltd.
Publication of WO2008133368A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • the present invention relates to technology for calculating users' attention levels to a document on the basis of a User Action Log (UAL) and applying the users' attention levels to the ranking of search results.
  • an information search system generates the results of a search by indexing documents matching a keyword entered by a user.
  • Documents included in the results of a search are provided in the form of a ranked list (a set of links indicating documents) through a statistical technique, such as content analysis or link analysis.
  • a document such as a web page, includes content and metadata.
  • Content has an inclusive meaning including audio and video files as well as text.
  • Metadata may include various attributes, such as a document language, a document title, a document size, a document identifier (for example, Uniform Resource Locator [URL] information), a document format, category, and other attributes.
  • the content and metadata of documents, and information about the relations between documents, are generally used.
  • information is described from the standpoint of an information provider who creates content or desires to distribute content, so that a description made from the standpoint of end consumers who consume content is not considered.
  • user-centered information, such as attractive content or currently popular content, is excluded from the factors determining ranking, and only provider-centered factors, such as document titles or backlinks, are used.
  • a representative information search system is 'Google'.
  • a search process is performed in such a way that link information (hyperlinks) indicating a given document is analyzed, in addition to the information included in the document, on the basis of the 'PageRank' technique; a PageRank value (ranging from 0 to 10) is assigned to the given document; and the analysis information contained in the document and the assigned PageRank value are summed, and thus ranked search results are provided.
  • an object of the present invention is to provide an information search ranking system and method, which convert actions taken by a user on each individual document (content) into an attention level, assign the attention level to a given document, and apply the assigned attention level to the ranking of search results, thus providing improved search results to the user.
  • the present invention provides an information search ranking method fundamentally applied to a system including a search engine for searching documents, stored in a document DB, for a desired document in response to a search request and providing ranked search results.
  • the information search ranking method based on users' attention levels comprises a step of collecting a plurality of documents over an information network and storing the documents in a document DB; a step of collecting and storing User Action Logs (UALs) from at least one user terminal or at least one network provider server; a step of calculating an Attention Rank (AR) from a multiplication of an Attention Value (AV) for all actions taken by each user by an Influence Value (IV) of the user for each action, with respect to all users who access an individual stored document, on a basis of the collected UALs, and storing the calculated AR in an attention rank DB; and a step of calculating a Rank Value (RV) with reference to the AR stored in the attention rank DB, with respect to each document searched in the document DB in response to a keyword-based search request received from the user terminal, thus providing ranked search results.
  • the Attention Value (AV) may be calculated by multiplying a predetermined weight ($w_k$), assigned to each action taken by a certain user accessing the individual document, by a sigmoid function that uses the elapsed time from the action ($t$) as a variable.
  • the Influence Value (IV) may be calculated by an equation (not reproduced in this text) in which:
  • a sigmoid function is applied,
  • $c_h$ is the total number of all actions taken by a user $h$, and
  • $m$ is the value obtained by dividing the sum of the values $c_h$ by the total number of users.
  • the present invention provides an information search ranking system and method, which use actions of users taken on an individual document as users' attention levels based on the memory model of a human being, thus providing excellent ranked search results sensitive to the users' attention or preference in response to a keyword-based search request. Further, the present invention can provide excellent ranked search results, even for the recent rapid proliferation of User Created Content (UCC).
  • FIG. 1 is a diagram showing the overall system to which the technical spirit of the present invention is applied;
  • FIG. 2 is a diagram showing the construction of an information search ranking system according to the present invention.
  • FIG. 3 is a diagram showing the detailed construction of an attention rank calculation module according to the present invention.
  • FIG. 4 is a flowchart of an information search ranking method according to the present invention.
  • FIG. 1 is a diagram showing the overall system to which the present invention is applied.
  • the system includes a document-using user terminal 100 for viewing documents provided by one or more content (document) servers (not shown) connected to an information network, and creating a User Action Log (UAL) for the viewed document; an information search ranking system 200 for collecting and indexing a plurality of documents over the information network, collecting UALs from the document-using user terminal or a network provider server (not shown), calculating an Attention Value (AV) on the basis of the collected UALs, and assigning an Attention Rank (AR) to each collected document; and a document-search user terminal 300 for requesting the information search ranking system 200 to search for a keyword, and receiving ranked search results, in which the user's attention level is taken into account, from the information search ranking system 200.
  • user terminals include mobile phones or computers enabling Internet communication, and are classified into a 'document-using user terminal' and a 'document-search user terminal' for convenience of description.
  • the 'document-using user terminal' is defined based on the function of viewing documents (for example, web surfing) provided in a plurality of content servers connected to the information network, and creating a UAL based on the viewed documents.
  • the 'document-search user terminal' is defined based on the function of storing Attention Ranks (ARs) for respective documents on the basis of a plurality of UALs collected by the information search ranking system 200, and then receiving ranked search results matching a keyword from the information search ranking system 200.
  • both the user terminals can be considered to be the same 'user terminal'.
  • 'User Action Log' (UAL) means a log file obtained by recording actions taken by the user (user actions) when the document-using user terminal views a certain document, and includes 1) a document identifier, 2) an action identifier, 3) an action type, 4) an action time, and 5) supplementary data.
  • the 'document identifier' is an identifier for a target document on which an action is taken.
  • the 'action identifier' is information for identifying a user who took an action, and may be, for example, an IP address or a Medium Access Control (MAC) address when the user uses a computer, or may be, for example, the phone number or unique information of a mobile phone when the user uses a mobile phone.
  • the 'action type' includes various types of actions, such as view (action of viewing summary information about a document, previewing a document, etc.), play (action of playing video, music, images, etc.), detailed view (action of viewing the full text of a document rather than summary information), save (action of saving or storing a document), buy (action of purchasing a document or content for sale), recommend (action of recommending a document or content to another person), evaluate (action of representing an individual's opinion about a document or content as digitized or standardized information), attach supplementary information (action of attaching supplementary information, such as a comment or tag, to a document or content), and bookmark
  • the 'supplementary data' may include environmental information about the action taken by the user, for example, the location of the user, the environmental conditions (travel), etc.
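  • As an illustration of the five UAL fields listed above, a single log entry could be modeled as the following record. This is only a sketch: the class name, field names, string spellings of the action types, and the sample values are assumptions made for the example and do not appear in the original text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict

# Action types named in the description; the string spellings are assumptions.
ACTION_TYPES = {
    "view", "play", "detailed_view", "save", "buy",
    "recommend", "evaluate", "attach_supplementary_info", "bookmark",
}


@dataclass(frozen=True)
class UserActionLog:
    """One UAL entry with the five fields listed in the description."""
    document_id: str        # 1) target document, e.g. its URL
    action_id: str          # 2) identifies who acted (IP/MAC address, phone number)
    action_type: str        # 3) one of ACTION_TYPES
    action_time: datetime   # 4) when the action was taken
    supplementary: Dict[str, Any] = field(default_factory=dict)  # 5) e.g. location

    def __post_init__(self) -> None:
        # Reject action types that the sketch does not know about.
        if self.action_type not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.action_type!r}")


# Hypothetical sample entry.
log = UserActionLog(
    document_id="http://example.com/doc/1",
    action_id="00:1A:2B:3C:4D:5E",
    action_type="play",
    action_time=datetime(2007, 4, 30, 12, 0, tzinfo=timezone.utc),
    supplementary={"location": "Seoul"},
)
```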
  • FIG. 2 is a diagram showing the detailed construction of the information search ranking system 200 according to an embodiment of the present invention.
  • the information search ranking system 200 includes a document collection module 210, a search engine 220, a UAL collection module 230, an attention rank calculation module 240, a rank value calculation module 250, a document DB 10, a UAL DB 20, and an attention rank DB 30.
  • the document collection module 210 collects a plurality of documents over the information network, and stores the documents in the document DB 10 so that the documents are indexed and stored in order to respond to a keyword-based search request (a search request received from the document-search user terminal).
  • the search engine 220 is an engine provided with functions required for a typical search operation of searching the document DB 10 for an input keyword and providing search results.
  • the search engine 220 provides ranked search results while referring to the Attention Ranks (ARs) of the attention rank DB 30, respectively related to the documents constituting the search results, assigns Rank Values (RVs), in which the 'users' attention levels' are taken into account, to the documents, and provides the documents, to which the RVs are assigned, in the form of a ranked list. This procedure is described in detail below.
  • the 'user's attention level' is a principal factor for evaluating documents (content), and is obtained by digitizing a user's attention level using a memory model of the user.
  • a human being recognizes a certain fact or an object, and then gradually forgets the fact or the object as time elapses.
  • This is the memory model of a human being, represented by a sigmoid function.
  • the total number of hits for a given document is generally used as an index.
  • such an index is only a simple accumulative index, which does not take into account the temporal component of the memory model.
  • the UAL collection module 230 collects User Action Logs (UALs) from the document-using user terminal 100 or the network provider server, and stores and manages the UALs in the UAL DB 20.
  • the UALs can be collected from a logging tool (software organized in a web browser) installed in the document-using user terminal 100, or from UALs stored in the network provider server. It will be apparent that the two collection methods can be used together.
  • the attention rank calculation module 240 includes the attention value calculation unit 242 and the action influence value calculation unit 244, thus calculating an Attention Rank (AR) from the multiplication of an Attention Value (AV) for all actions taken by each user by the Influence Value (IV) of the user for each action, with respect to all users who access an individual document ($p$ of the following Equation [1]), and updating the calculated AR in the attention rank DB 30.
  • the Attention Rank (AR) is defined by the following Equation [1].
  • $$AR(p) = \sum_{\forall h,k} \left[ AV(e_{hpk}) \times IV(h) \right] \qquad [1]$$
  • In Equation [1], the AV, calculated by the attention value calculation unit 242, is represented by Equation [2], the product of the weight $w_k$ and a sigmoid function of $t$, where $e_{hpk}$ means that a user $h$ takes an action $k$ on a document $p$, $w_k$ is a predetermined weight previously assigned to the type of action $k$, and $t$ is the difference between the time at which the AV is calculated and the time at which the action $k$ actually occurred, that is, the elapsed time from the action $k$ (hereinafter referred to as the 'elapsed time from an action').
  • the weight has a value ranging from 0 to 1 according to the type of action.
  • for example, a weight can be defined as 0.2 for one type of action, 0.4 for 'play', 0.5 for 'detailed view', and 0.9 for 'recommend'.
  • Equation [2] is a sigmoid function having the elapsed time from an action $t$ as a variable, as defined in the right term thereof, and is implemented by modeling the memory of a human being, as described above. Therefore, the contribution of the weight $w_k$ decreases as the elapsed time from the action increases.
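  • The verbal description of the AV (a per-action weight multiplied by a sigmoid of the elapsed time) can be sketched as follows. The exact form of Equation [2] is not reproduced in this text, so the decreasing sigmoid used here, its `midpoint` and `steepness` parameters, the day-based time unit, and the 0.2 weight for 'view' are all assumptions chosen for illustration; only the weights 0.4, 0.5, and 0.9 come from the example in the description.

```python
import math

# Weights per action type. 0.4, 0.5 and 0.9 come from the example in the text;
# the 0.2 entry for "view" is an assumption (the text does not name that action).
WEIGHTS = {"view": 0.2, "play": 0.4, "detailed_view": 0.5, "recommend": 0.9}


def decay(t_days: float, midpoint: float = 30.0, steepness: float = 0.2) -> float:
    """Decreasing sigmoid of elapsed time (assumed form, not Equation [2] itself).

    Equals 1 / (1 + exp(steepness * (t_days - midpoint))), written with tanh
    so that very large elapsed times cannot overflow.
    """
    x = steepness * (t_days - midpoint)
    return 0.5 * (1.0 - math.tanh(x / 2.0))


def attention_value(action_type: str, t_days: float) -> float:
    """AV = w_k * sigmoid(t): recent actions count almost fully, old ones fade."""
    return WEIGHTS[action_type] * decay(t_days)


# A fresh recommendation outweighs the same action taken two months ago.
fresh = attention_value("recommend", 0.0)
stale = attention_value("recommend", 60.0)
```

  • Unlike a cumulative hit count, this value shrinks as the action ages, which is the temporal component that the memory model is meant to capture.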
  • the influence value (IV) of the user who took the action must be taken into account in the AV calculated by the attention value calculation unit 242. For example, if it is assumed that there is a user who took 1,000 actions and a user who took 10 actions, the former user is counted as participating in the document 100 times more than the latter user. In other words, when users' attention levels paid to the corresponding document are calculated, the user who took 1,000 actions has an excessively large influence on the document, compared to the remaining users. Therefore, a correction that allows the influence value (IV) of the user, calculated by the action influence value calculation unit 244, to be taken into account in the AV must be performed.
  • the influence value (IV) of the user for a given action is represented by Equation [3] (not reproduced in this text), in which:
  • a sigmoid function is applied,
  • $c_h$ is the total number of all actions taken by the user $h$, and
  • $m$ is the value obtained by dividing the sum of the values $c_h$ by the total number of users.
  • $m$ in Equation [3] can be summarized as $m = \frac{\sum_{h} c_h}{\text{number of users}}$.
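  • The correction described above can be sketched as follows. The computation of $m$ follows the definition in the text; the shape of `influence_value` is purely an assumption, since Equation [3] is not reproduced here. It is chosen only so that a user whose action count $c_h$ lies far above the mean $m$ receives a smaller influence value. The function names and the sample counts are hypothetical.

```python
import math
from typing import Dict


def mean_actions(counts: Dict[str, int]) -> float:
    """m: the sum of c_h over all users divided by the number of users."""
    return sum(counts.values()) / len(counts)


def influence_value(c_h: int, m: float) -> float:
    """Assumed IV: a decreasing sigmoid of how far c_h lies above the mean m.

    Equals 1 / (1 + exp((c_h - m) / m)), written with tanh to avoid overflow.
    This is an illustrative stand-in, not the patent's Equation [3].
    """
    x = (c_h - m) / m
    return 0.5 * (1.0 - math.tanh(x / 2.0))


# Hypothetical action counts c_h: one very active user and two occasional users.
counts = {"u1": 1000, "u2": 10, "u3": 10}
m = mean_actions(counts)
iv = {user: influence_value(c, m) for user, c in counts.items()}
```

  • With these sample counts, the user with 1,000 actions receives a lower influence value than the users with 10 actions, so that user's actions no longer dominate the attention level of a document.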
  • the rank value calculation module 250 calculates a Rank Value (RV) with reference to the AR for each extracted document, and thus ranks the documents.
  • the AR can be used as a factor for calculating the RV, but the degree of relation of each document to a keyword can be additionally taken into account using the following Equation [4], according to the circumstances:
  • $$RV(p) = DocRel(p) \times AR(p) \qquad [4]$$
  • where $DocRel(p)$ is the degree of relation of a predetermined document to the keyword.
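  • Equation [4] can be illustrated with a short sketch that multiplies each document's relevance by its stored AR and sorts by the result. The function names and sample scores are hypothetical, and treating a document with no stored AR as having an AR of 0 is an assumption of this sketch, not something the text specifies.

```python
from typing import Dict, List


def rank_values(doc_rel: Dict[str, float], attention_rank: Dict[str, float]) -> Dict[str, float]:
    """RV(p) = DocRel(p) * AR(p) for every found document p."""
    # Assumption: a document without a stored AR is treated as AR = 0.
    return {p: rel * attention_rank.get(p, 0.0) for p, rel in doc_rel.items()}


def ranked_list(doc_rel: Dict[str, float], attention_rank: Dict[str, float]) -> List[str]:
    """Document identifiers ordered by RV, highest first."""
    rv = rank_values(doc_rel, attention_rank)
    return sorted(rv, key=rv.get, reverse=True)


# Hypothetical scores: p1 is the most relevant, but p2 has drawn more attention.
doc_rel = {"p1": 0.9, "p2": 0.6, "p3": 0.8}
attention_rank = {"p1": 0.5, "p2": 1.5, "p3": 0.5}
order = ranked_list(doc_rel, attention_rank)
```

  • In this example the less relevant but more attended document p2 is listed first, which shows how the AR shifts a purely relevance-based ordering.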
  • the document collection module 210 collects and stores a plurality of documents over the information network at step S100. This step is performed by a typical information search system at predetermined periods.
  • the UAL collection module 230 collects UALs for documents from the document-using user terminal 100 or the network provider server, and stores the UALs at step S200.
  • the attention rank calculation module 240 calculates an AR in consideration of both the AV for all actions, taken by each user, and the IV of the user for each action, with respect to all users who took actions on each document (each document stored in the document DB) , on the basis of the UALs collected at step S200, at step S300.
  • the search engine 220 searches the document DB 10 for documents matching a keyword at step S400.
  • the rank value calculation module 250 calculates the Rank Values (RVs) of found documents with reference to ARs for respective found documents stored in the attention rank DB 30, and ranks the found documents at step S500.
  • the search engine 220 provides the ranked search results to the document-search user terminal 300 that requested the search at step S600. Through a series of these steps, a querying person (user) can view excellent search results, in which the 'user's attention levels' are taken into account.
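  • Steps S300 to S500 can be put together in a minimal end-to-end sketch. Everything concrete in it is an assumption made for illustration: the tuple layout of the logs, the day-based timestamps, the sigmoid forms standing in for Equations [2] and [3], the `midpoint` parameter, the use of a keyword count as DocRel, and the sample documents. Only the accumulation of AV × IV per document and the product DocRel × AR follow the description directly.

```python
import math
from collections import Counter, defaultdict

# Weights taken from the example in the description.
WEIGHTS = {"play": 0.4, "detailed_view": 0.5, "recommend": 0.9}


def falling_sigmoid(x):
    """1 / (1 + e^x), written with tanh so large x cannot overflow."""
    return 0.5 * (1.0 - math.tanh(x / 2.0))


def attention_ranks(logs, now, midpoint=30.0):
    """Step S300: AR(p) = sum over users h and actions k of AV(e_hpk) * IV(h).

    logs is a list of (user, document, action_type, day) tuples.
    """
    counts = Counter(user for user, _, _, _ in logs)   # c_h per user
    m = sum(counts.values()) / len(counts)             # mean actions per user
    ar = defaultdict(float)
    for user, doc, action, day in logs:
        # Assumed stand-in for Equation [2]: weight times a time-decaying sigmoid.
        av = WEIGHTS[action] * falling_sigmoid(now - day - midpoint)
        # Assumed stand-in for Equation [3]: heavy users get less influence.
        iv = falling_sigmoid((counts[user] - m) / m)
        ar[doc] += av * iv
    return dict(ar)


def search(keyword, documents, ar):
    """Steps S400-S500: find matching documents, then order them by RV."""
    # DocRel is approximated by a keyword count (assumption).
    found = {p: text.lower().count(keyword.lower()) for p, text in documents.items()}
    found = {p: rel for p, rel in found.items() if rel > 0}
    return sorted(found, key=lambda p: found[p] * ar.get(p, 0.0), reverse=True)


documents = {"p1": "jazz concert video", "p2": "jazz history", "p3": "cooking"}
logs = [
    ("u1", "p2", "recommend", 99),   # recent actions on p2
    ("u2", "p2", "play", 98),
    ("u3", "p1", "play", 10),        # an old action on p1
]
ar = attention_ranks(logs, now=100)
results = search("jazz", documents, ar)
print(results)  # prints ['p2', 'p1']
```

  • Both p1 and p2 match the keyword equally, but p2 is ranked first because its actions are recent, while the single action on p1 has decayed.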
  • the present invention provides an information search ranking system and method, which use actions of users taken on an individual document as users' attention levels based on the memory model of a human being, thus providing excellent ranked search results sensitive to the users' attention or preference in response to a keyword-based search request. Further, the present invention can provide excellent ranked search results, even for the recent rapid proliferation of User Created Content (UCC).

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to an information search ranking system and method based on users' attention levels. The information search ranking system includes a User Action Log (UAL) collection module that collects a plurality of UALs from at least one user terminal or at least one network provider server. An attention rank calculation module (240) calculates an Attention Rank (AR) by multiplying an Attention Value (AV) for all actions taken by each user by an Influence Value (IV) of the user for each action, with respect to all users who access an individual document, on the basis of the collected UALs, and updates the calculated AR. A rank value calculation module (252) calculates a Rank Value (RV) with reference to the AR, for each document found by the search engine in the document DB in response to a keyword-based search request, thereby ranking the documents.
PCT/KR2007/002701 2007-04-30 2007-06-04 Information search ranking system and method based on users' attention levels WO2008133368A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2007-0041741 2007-04-30
KR1020070041741A KR100923505B1 (ko) 2007-04-30 2007-04-30 Information search ranking system reflecting users' attention levels and method thereof

Publications (1)

Publication Number Publication Date
WO2008133368A1 2008-11-06

Family

ID=39925803

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2007/002701 WO2008133368A1 (en) 2007-04-30 2007-06-04 Information search ranking system and method based on users' attention levels

Country Status (2)

Country Link
KR (1) KR100923505B1 (fr)
WO (1) WO2008133368A1 (fr)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
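KR101078864B1 2009-03-26 2011-11-02 한국과학기술원 System and method for analyzing changes in query/document topic categories, and query expansion-based information retrieval system and method using the same
KR101218141B1 * 2011-09-27 2013-01-03 (주)레드테이블 Ranking calculation method and system
KR101894419B1 * 2011-12-12 2018-09-04 에스케이플래닛 주식회사 Personalized information providing system and method, and recording medium therefor
KR101624284B1 * 2014-08-06 2016-06-08 네이버 주식회사 Information providing system and method
KR101649146B1 2015-01-15 2016-08-19 주식회사 카카오 Search method and search server
KR101593876B1 * 2015-04-27 2016-02-16 연세대학교 산학협력단 Environmental impact assessment method and system for calculating weights per environmental impact category using Internet search volume information
CN105243124B * 2015-09-29 2018-11-09 百度在线网络技术(北京)有限公司 Resource combination processing method and apparatus
KR102008386B1 * 2017-09-14 2019-08-08 인하대학교 산학협력단 Recall-based patent search engine evaluation system and method thereof
KR102008387B1 * 2018-04-30 2019-08-07 인하대학교 산학협력단 Non-recall-based patent search engine evaluation system and method thereof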

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020043679A * 2000-12-02 2002-06-12 김동현 Search support system and method having continuity according to user tendencies, and recording medium storing the program source thereof
US6665655B1 * 2000-04-14 2003-12-16 Rightnow Technologies, Inc. Implicit rating of retrieved information in an information search system
KR20040006515A * 2002-07-12 2004-01-24 주식회사 네오위즈 Information service system and method for providing an information service scheme and search results using analysis of user-entered information and action logs
KR20050095230A * 2004-03-25 2005-09-29 주식회사 첫눈 Method and system for providing an information service and an information search service using logs of URLs visited by users
KR20060050397A * 2004-10-05 2006-05-19 마이크로소프트 코포레이션 System, method, and interface for providing personalized search and information access

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8979538B2 (en) 2009-06-26 2015-03-17 Microsoft Technology Licensing, Llc Using game play elements to motivate learning
WO2011139491A3 * 2010-04-30 2012-02-09 Microsoft Corporation Prioritization of resources based on user activities
CN102870113A * 2010-04-30 2013-01-09 微软公司 Prioritization of resources based on user activities
JP2013529332A * 2010-04-30 2013-07-18 マイクロソフト コーポレーション Prioritization of resources based on user activities
US9697500B2 (en) 2010-05-04 2017-07-04 Microsoft Technology Licensing, Llc Presentation of information describing user activities with regard to resources
US8819009B2 (en) 2011-05-12 2014-08-26 Microsoft Corporation Automatic social graph calculation
US9477574B2 (en) 2011-05-12 2016-10-25 Microsoft Technology Licensing, Llc Collection of intranet activity data
WO2022187478A1 * 2021-03-03 2022-09-09 Home Depot International, Inc. Attentive pseudo-relevance feedback network for query categorization
US11960555B2 (en) 2021-03-03 2024-04-16 Home Depot Product Authority, Llc Attentive pseudo-relevance feedback network for query categorization

Also Published As

Publication number Publication date
KR100923505B1 (ko) 2009-11-02
KR20080096887A (ko) 2008-11-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07746831

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07746831

Country of ref document: EP

Kind code of ref document: A1