CN105488697A - Potential customer mining method based on customer behavior characteristics - Google Patents
- Publication number: CN105488697A
- Application number: CN201510903856.4A
- Authority: CN (China)
- Prior art keywords
- page
- data
- feature
- user
- time
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
Description
Technical Field
The invention relates to mining potential customers by analyzing website user access logs, and in particular to a potential customer mining method based on customer behavior characteristics.
Background Art
In today's increasingly competitive e-commerce market, an enterprise gains revenue and a competitive advantage by continually acquiring new customers, effectively identifying potential customers among its many visitors, and converting those potential customers into actual customers. The purpose of potential customer mining is to give a website an accurate basis for formulating service strategies and making the corresponding decisions.
The basic data for potential customer mining comes from the website's access log, which records each customer's behavior while visiting the site. This information is easy to obtain.
The access log records the visitor's IP address, login ID, access time, VINFO, REFERER (the previously visited page), REQUEST (the requested page), SEARCH_WORD (the search term), PROD_ID (the ID of the browsed product), ORDER_ID, and other information.
Table 1 Access log information
Of these fields, REFERER and REQUEST are the most important for analyzing where a visit came from and where it went, and for judging behaviors such as whether the visitor intends to purchase or has added an item to the shopping cart. Mining these behaviors yields customer behavior characteristics that reflect the customer's category fairly effectively: which access patterns indicate a loyal customer, which indicate a pure visitor, and which indicate a potential customer.
The problem that potential customer mining must solve is therefore how to find the behavior characteristics of potential customers in the website's access log.
Summary of the Invention
Purpose of the invention: the invention provides a potential customer mining method based on customer behavior characteristics, giving a website an accurate basis for formulating service strategies and making the corresponding decisions.
Technical solution: to achieve the above purpose, the invention adopts a potential customer mining method based on customer behavior characteristics, characterized by the following steps:
Step One: data preprocessing
Step 1: Data cleaning
The raw log accumulates a large amount of customer browsing information, much of it redundant for data mining, such as requests for images, SMS verification, and logo images. The unneeded rows are deleted first.
Step 2: Build the URL rule list
Analyze the REQUEST field in the web data of the Xinyizhan (新一站) site. For example, a REQUEST containing 'confirm' indicates purchase intent. The result is a URL rule list. When features are computed later, the REQUEST field no longer has to be analyzed row by row; it is matched against the URLs in the rule list to obtain url_name.
Step 3: User identification
In the access log table, a user is identified jointly by vinfo, iptonumber, and the customer ID (login_id):
vinfo: equivalent to a cookie; identifies one computer;
iptonumber: the IP address; the same computer has a different IP when it logs in from a different place;
login_id: the member login ID; login_id = -1 for visitors who are not logged in as members.
Step 4: Feature extraction
Taking one session as the unit, compute for each user in a session feature attributes such as the access source, number of pages viewed, number of product detail pages viewed, number of products viewed, page dwell time, product detail page dwell time, number of times the filter list was viewed, whether a business topic was viewed, the time period of the user's first visit of the day, and whether the user viewed the shopping cart. Whether the user intends to purchase serves as the class attribute. Together these form the training samples.
Step 5: Filter the training set
The behavior data in the web log is produced by all users of the Xinyizhan website over a period of time. It includes data from people who have purchased many times (loyal customers), people who have purchased only a few times (existing customers), potential customers, and people who viewed the home page but no products (pure browsers).
By analyzing the number of purchases over a period of time, customers with multiple purchases are excluded. Customers making their first purchase of a product, and customers who browsed without purchasing, are selected as the mining targets.
Step Two: rough-set-based reduction of feature attributes
With respect to the class attribute, some of the 11 feature attributes extracted in Step One may be redundant. Using rough set theory, redundant attributes can be removed without degrading classification performance, which reduces computation and improves classification accuracy.
Method:
First, compute the core (Core) using the relative positive region:
Step 1: Initialize $\mathrm{Core} = \varnothing$ and $C = \{a_1, a_2, \ldots, a_j\}$, $j = 1, 2, \ldots, 11$, where each $a_j$ is a feature attribute and $D = \{a_{12}\}$ is the class attribute. Compute the relative positive region $\mathrm{POS}_C(D)$.
Step 2: For each attribute $a_j$, let $B = C - \{a_j\}$, compute the relative positive region $\mathrm{POS}_B(D)$, and compare it with $\mathrm{POS}_C(D)$. If $\mathrm{POS}_C(D) \neq \mathrm{POS}_B(D)$, then $a_j$ is a core attribute and $\mathrm{Core} = \mathrm{Core} \cup \{a_j\}$. Repeat until every attribute has been tested.
Step 3: Return Core and stop.
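The core computation above can be sketched in base R. This is a minimal illustration rather than the applicant's code: the helper names pos_region and find_core, the toy data frame, and its column names are all assumptions, and the attributes are assumed to be already discretized.

```r
# Rows of `data` whose equivalence class under `attrs` has exactly one
# decision value; this is the relative positive region POS_attrs(D).
pos_region <- function(data, attrs, decision) {
  if (length(attrs) == 0) {
    key <- rep("all", nrow(data))  # no attributes: a single class
  } else {
    cols <- unname(as.list(data[, attrs, drop = FALSE]))
    key <- do.call(paste, c(cols, sep = "\r"))
  }
  n_dec <- tapply(as.character(data[[decision]]), key,
                  function(d) length(unique(d)))
  which(as.vector(n_dec[key]) == 1)
}

# An attribute is in the core if dropping it changes the positive region.
find_core <- function(data, cond, decision) {
  pos_full <- pos_region(data, cond, decision)
  core <- character(0)
  for (a in cond) {
    pos_b <- pos_region(data, setdiff(cond, a), decision)
    if (!identical(pos_b, pos_full)) core <- c(core, a)
  }
  core
}

# Hypothetical toy sessions: `buy` is the class attribute.
toy <- data.frame(
  source = c("ad", "ad", "search", "search", "direct", "direct"),
  cart   = c(1, 0, 1, 0, 1, 0),
  topic  = c(1, 1, 0, 0, 1, 1),
  buy    = c(1, 0, 1, 0, 1, 0)
)
core <- find_core(toy, c("source", "cart", "topic"), "buy")
print(core)
```

In this toy table only cart cannot be dropped without making some sessions ambiguous, so it is the only core attribute.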
Next, compute the reduct (Reduce) using attribute dependency:
Step 1: Initialize the remaining attributes $\mathrm{RestAtt} = C - \mathrm{Core}$ and $\mathrm{Reduce} = \mathrm{Core}$.
Step 2: Compare $\mathrm{POS}_{\mathrm{Core}}(D)$ with $\mathrm{POS}_C(D)$. If they are equal, Core is the reduct; otherwise go to Step 3.
Step 3: For each remaining attribute $a_j$ in RestAtt, compute $K$, the degree of dependency of $D$ on $\mathrm{Reduce} \cup \{a_j\}$ (in standard rough set theory, $K = |\mathrm{POS}_{\mathrm{Reduce} \cup \{a_j\}}(D)| / |U|$, where $U$ is the set of all samples). Select the attribute $a_k$ with the largest $K$, set $\mathrm{Reduce} = \mathrm{Reduce} \cup \{a_k\}$ and $\mathrm{RestAtt} = \mathrm{RestAtt} - \{a_k\}$, and compare $\mathrm{POS}_{\mathrm{Reduce}}(D)$ with $\mathrm{POS}_C(D)$. If they are equal, go to Step 4; otherwise continue the loop.
Step 4: Return Reduce and stop.
The resulting Reduce is the set of feature attributes finally fed to the classifier. This adds a rough-set-based attribute screening step to the method.
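The greedy reduct search can be sketched as follows, again in base R and under stated assumptions: the function names, the toy data, and the tie-breaking rule (the first attribute in column order wins) are illustrative choices, not details given in the source. Because the positive region of a subset of attributes is always contained in the positive region of the full set, comparing sizes is enough to test equality.

```r
# Relative positive region: rows whose class under `attrs` is consistent.
pos_region <- function(data, attrs, decision) {
  if (length(attrs) == 0) {
    key <- rep("all", nrow(data))
  } else {
    cols <- unname(as.list(data[, attrs, drop = FALSE]))
    key <- do.call(paste, c(cols, sep = "\r"))
  }
  n_dec <- tapply(as.character(data[[decision]]), key,
                  function(d) length(unique(d)))
  which(as.vector(n_dec[key]) == 1)
}

# Start from the core and add the attribute with the highest dependency
# degree K until the positive region matches that of the full set.
greedy_reduct <- function(data, cond, decision, core = character(0)) {
  n_full <- length(pos_region(data, cond, decision))
  reduce <- core
  rest <- setdiff(cond, core)
  while (length(pos_region(data, reduce, decision)) < n_full &&
         length(rest) > 0) {
    k <- sapply(rest, function(a) {
      length(pos_region(data, c(reduce, a), decision)) / nrow(data)
    })
    best <- rest[which.max(k)]  # ties go to the first attribute
    reduce <- c(reduce, best)
    rest <- setdiff(rest, best)
  }
  reduce
}

# Hypothetical toy sessions in which `source` and `pages` carry the same
# information, so either one completes the reduct together with `cart`.
toy <- data.frame(
  source = c("ad", "ad", "search", "search"),
  cart   = c(1, 0, 1, 0),
  pages  = c("high", "high", "low", "low"),
  buy    = c(1, 0, 0, 0)
)
reduct <- greedy_reduct(toy, c("source", "cart", "pages"), "buy", core = "cart")
print(reduct)
```

Here cart alone leaves two sessions ambiguous, so one more attribute is added before the loop stops.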
Step Three: random forest potential customer identification model based on customer behavior characteristics
The random forest algorithm is implemented with the randomForest package (version 4.6-6) for R 3.0.2. The program connects to the Oracle database through an ODBC data source, uses the function get_data() to fetch the required data, and uses the function cal_feature() to compute the features. After the training set is filtered, the random forest classification model model_rf is applied to the feature data to predict the IP and cookie of each potential customer. Finally, the IP and cookie are used to look up the potential customer's user information in existing tables, and the result is written to the database.
In Step Three:
Step 1: Connect to the database. The function get_data() fetches the required data from the database; its parameter chan is the database connection and cal_number is the date of the data to fetch:
data=sqlQuery(chan,sql,stringsAsFactors=FALSE)
The connection between R and the Oracle database is established with the odbcConnect() function of the RODBC package:
chan=odbcConnect("dm_xyz",uid='######',pwd='******')
Here dm_xyz is the system DSN name of the ODBC data source, uid is the user name, and pwd is the login password.
Once the connection is established, the required data is fetched by executing an SQL statement, and the data cleaning rules of Step One are added to that statement.
Step 2: URL matching, update page information
The URL rule text file built in Step 2 of data preprocessing is read from local disk. The REQUEST field of each record is matched against the rule keywords and the VISIT_PAGE field is updated accordingly; records with no match are set to "-1".
Step 3: Feature calculation, cal_feature(data)
The function cal_feature splits the single-day data returned by get_data into separate browsing sessions and computes the features of each session, producing the feature data set.
Step Four: performance verification of the potential customer identification model
An Oracle stored procedure determines the proportion of identified potential customers who actually purchase within one month after the date on which they were identified. This proportion is the model's performance metric: if the follow-up purchase rate of the customers predicted each day is relatively high and stable, the model performs well.
Beneficial effects: the invention provides a potential customer mining method based on customer behavior characteristics that builds its classification model with the stable random forest algorithm. It is efficient and accurate, and gives a website an accurate basis for formulating service strategies and making the corresponding decisions.
Brief Description of the Drawings
Figure 1 is the structure diagram of the whole program, in which model_rf.Rdata is the already-trained classification model.
Detailed Description
Step One: data preprocessing
Step 1: Data cleaning
The raw log accumulates a large amount of customer browsing information, much of it redundant for data mining, such as requests for images, SMS verification, and logo images. The unneeded rows are deleted first.
The REQUEST values of the rows to be deleted are as follows:
Step 2: Build the URL rule list for the applicant's Xinyizhan (新一站) site
Analyze the REQUEST field in the web data of the applicant's Xinyizhan site. For example, a REQUEST containing 'baoxian' means the user viewed an insurance topic, and one containing 'confirm' indicates purchase intent. The result is the Xinyizhan URL rule list. When features are computed later, the REQUEST field no longer has to be analyzed row by row; it is matched against the URLs in the rule list to obtain url_name, as shown in Table 2.
Table 2 Xinyizhan URL rules
Step 3: User identification
In the Xinyizhan access log table, a user is identified jointly by vinfo, iptonumber, and the customer ID (login_id):
vinfo: equivalent to a cookie; identifies one computer;
iptonumber: the IP address; the same computer has a different IP when it logs in from a different place;
login_id: the Xinyizhan member login ID; login_id = -1 for visitors who are not logged in as members.
Step 4: Feature extraction
Taking one session as the unit, compute for each user in a session feature attributes such as the access source, number of pages viewed, number of product detail pages viewed, number of products viewed, page dwell time, product detail page dwell time, number of times the filter list was viewed, whether an insurance topic was viewed, whether the insurance dictionary was viewed, the time period of the user's first visit of the day, and whether the user viewed the shopping cart. Whether the user intends to purchase serves as the class attribute. Together these form the training samples.
a) Feature 1: access source. The first referer of each session determines where the customer came from: advertising, search engine, email, direct access, or other.
b) Feature 2: number of pages viewed. The number of pages the user viewed within a session.
c) Feature 3: number of product detail pages viewed. The number of product detail pages the user viewed within a session.
d) Feature 4: page dwell time. The average time the user spent on each page within a session.
e) Feature 5: product detail page dwell time. The average time the user spent on each product detail page within a session.
f) Feature 6: number of times the filter list was viewed. The number of requests in a session that contain 'viewsearchlist'.
g) Feature 7: whether an insurance topic was viewed, that is, whether any request contains 'baoxian'.
h) Feature 8: whether the insurance dictionary was viewed, that is, whether any request contains 'toptag'.
i) Feature 9: browsing period. The time of the first visit in a session (visit_time).
j) Feature 10: whether the shopping cart was viewed, '1' for yes and '0' for no, determined by whether any request contains shopping_car.
k) Class attribute: purchase intent, '1' for yes and '0' for no, determined by whether any request contains confirm; a request containing confirm indicates purchase-intent behavior.
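The keyword-based features in the list above (features 6, 7, 8, 10 and the class attribute) can be derived with simple substring tests. The sketch below is illustrative only: the function name session_flags, the output field names, and the sample request paths are assumptions, while the keywords are the ones named in the text.

```r
# Derive the keyword-based features for one session.
# `requests` is the character vector of REQUEST values in that session.
session_flags <- function(requests) {
  has <- function(keyword) {
    as.integer(any(grepl(keyword, requests, fixed = TRUE)))
  }
  list(
    filter_views    = sum(grepl("viewsearchlist", requests, fixed = TRUE)),
    insurance_topic = has("baoxian"),       # feature 7
    insurance_dict  = has("toptag"),        # feature 8
    viewed_cart     = has("shopping_car"),  # feature 10
    buy_flag        = has("confirm")        # class attribute
  )
}

# Hypothetical request paths for a single session.
reqs <- c("/baoxian/list.html", "/viewsearchlist?type=1",
          "/viewsearchlist?type=2", "/shopping_car/view")
flags <- session_flags(reqs)
str(flags)
```

Using fixed = TRUE treats each keyword as a literal string rather than a regular expression, which matches the "contains" wording of the feature definitions.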
Step 5: Filter the training set
The behavior data in the web log is produced by all users of the Xinyizhan website over a period of time. It includes data from people who have purchased many times (loyal customers), people who have purchased only a few times (existing customers), potential customers, and people who viewed the home page but no products (pure browsers).
Training the classification model directly on all of this data would introduce large errors. To avoid them, the behavior of the other customer types, which would interfere with the accuracy of the model, must be excluded.
By analyzing the number of purchases over a period of time, customers with multiple purchases are excluded. The invention selects customers making their first purchase of a product, and customers who browsed without purchasing, as the mining targets.
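The filtering rule of Step 5 can be illustrated with a small base R sketch. It simplifies the text in one respect that should be stated plainly: purchases are counted per visitor across all products, whereas the text speaks of the first purchase of a given product. The table layouts, column names, and the cutoff of one purchase are assumptions.

```r
# Hypothetical session-level features and paid-order records.
sessions <- data.frame(
  vinfo    = c("c1", "c2", "c3", "c3", "c4"),
  buy_flag = c(0, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)
orders <- data.frame(
  vinfo = c("c2", "c3", "c3", "c3"),
  stringsAsFactors = FALSE
)

# Number of purchases made by the visitor behind each session.
purchase_count <- vapply(sessions$vinfo,
                         function(v) sum(orders$vinfo == v),
                         integer(1), USE.NAMES = FALSE)

# Keep browse-only visitors (0 purchases) and first-time buyers (1 purchase);
# visitors with repeat purchases are dropped from the training pool.
max_purchases <- 1
training_pool <- sessions[purchase_count <= max_purchases, ]
print(training_pool)
```

Visitor c3 has three orders, so both of its sessions are excluded.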
Step 6: Filter the feature attributes
Rough set theory is used to eliminate redundant attributes. The feature attributes finally retained are: access source, number of pages viewed, number of product detail pages viewed, number of products viewed, page dwell time, product detail page dwell time, number of searches, whether an insurance topic was viewed, whether the insurance dictionary was viewed, and whether the user viewed the shopping cart.
Step Three: random forest potential customer identification model based on customer behavior characteristics
The random forest algorithm is implemented with the randomForest package (version 4.6-6) for R 3.0.2. The program connects to the Oracle database through an ODBC data source, uses the function get_data() to fetch the required data, and uses the function cal_feature() to compute the features. After the training set is filtered, the random forest classification model model_rf is applied to the feature data to predict the IP and cookie of each potential customer. Finally, the IP and cookie are used to look up the potential customer's user information in existing tables, and the result is written to the database.
Step 1: Connect to the database
The function get_data() fetches the required data from the database; its parameter chan is the database connection and cal_number is the date of the data to fetch.
data=sqlQuery(chan,sql,stringsAsFactors=FALSE)
The connection between R and the Oracle database is established with the odbcConnect() function of the RODBC package:
chan=odbcConnect("dm_xyz",uid='######',pwd='******')
Here dm_xyz is the system DSN name of the ODBC data source, uid is the user name, and pwd is the login password.
Once the connection is established, the required data is fetched by executing an SQL statement, and the data cleaning rules of Step One are added to that statement.
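One way to fold the cleaning rules into the SQL statement is to generate NOT LIKE filters from a list of patterns. The sketch below only builds the query string. The table name web_log, the column name CAL_NUMBER, the integer date format, and the exclusion patterns are all assumptions, since the source does not reproduce its SQL or its list of excluded REQUEST values.

```r
# Assumed patterns for rows that carry no behavioral information.
exclude_patterns <- c("%.jpg%", "%.png%", "%.gif%", "%logo%")

# Build the query for one day, with one NOT LIKE filter per pattern.
build_query <- function(cal_number, patterns) {
  filters <- paste0("REQUEST NOT LIKE '", patterns, "'", collapse = " AND ")
  paste0("SELECT * FROM web_log WHERE CAL_NUMBER = ", cal_number,
         " AND ", filters)
}

sql <- build_query(20151208L, exclude_patterns)
cat(sql, "\n")
# With an open RODBC connection `chan`, the data would then be fetched with:
# data <- sqlQuery(chan, sql, stringsAsFactors = FALSE)
```

Filtering in the database keeps the discarded rows from ever being transferred to R. The patterns here are fixed constants; values coming from user input would need proper escaping.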
Step 2: URL matching, update page information
The URL rule text file built in Step 2 of data preprocessing is read from local disk. The REQUEST field of each record is matched against the rule keywords and the VISIT_PAGE field is updated accordingly; records with no match are set to "-1".
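The matching step can be sketched as a keyword lookup. The rule table below is hypothetical: the keywords come from the text, but the page names, the sample requests, and the rule that earlier rows take precedence when several keywords match are assumptions made for illustration.

```r
# Hypothetical rule table: keyword found in REQUEST -> page name.
url_rules <- data.frame(
  keyword  = c("confirm", "shopping_car", "viewsearchlist", "baoxian"),
  url_name = c("purchase intent", "shopping cart", "filter list",
               "insurance topic"),
  stringsAsFactors = FALSE
)

# Return the VISIT_PAGE value for each request; "-1" means no rule matched.
match_visit_page <- function(requests, rules) {
  visit_page <- rep("-1", length(requests))
  # Apply rules last to first so that earlier rules overwrite later ones.
  for (i in rev(seq_len(nrow(rules)))) {
    hit <- grepl(rules$keyword[i], requests, fixed = TRUE)
    visit_page[hit] <- rules$url_name[i]
  }
  visit_page
}

requests <- c("/order/confirm?id=1", "/images/logo.png", "/baoxian/topic1")
visit_page <- match_visit_page(requests, url_rules)
print(visit_page)
```

In the actual program the rule table would be read from the text file, for example with read.table, instead of being defined inline.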
Step 3: Feature calculation, cal_feature(data)
The function cal_feature splits the single-day data returned by get_data into separate browsing sessions and computes the features of each session, producing the feature data set.
(1) Cookie information
(2) Whether the user is logged in
(3) Number of pages viewed and average page dwell time
# Number of pages viewed
page_tag=which(tmp$VISIT_PAGE!='-1')
page_n=length(page_tag)
# Number of product detail pages ('产品详情页' is the page name for a product detail page)
prod_tag=which(tmp$VISIT_PAGE=='产品详情页')
page_prod=length(prod_tag)
# Dwell times of pages and product detail pages (page_n-1 intervals)
time=strptime(tmp[page_tag,2],"%Y-%m-%d %H:%M:%S") # request time of each actual page
stay_time=as.numeric(diff(time),units="secs") # gap between consecutive requests, in seconds, used as the page dwell time
page_time_avg=sum(stay_time,na.rm=TRUE)/page_n # average page dwell time
prod_time=stay_time[which(page_tag%in%prod_tag)] # dwell times of the product detail pages
prod_time_avg=sum(prod_time,na.rm=TRUE)/page_prod # average product detail page dwell time
(4) Number of times the filter list was viewed
# Number of filter list views (0 means no search; '搜索列表' is the page name for the search list)
sear_n=length(which(tmp$VISIT_PAGE=="搜索列表"))
(5) Whether the shopping cart was viewed
(6) Whether an insurance topic was viewed
(7) Whether the insurance dictionary was viewed
(8) Type of source website, determined from keywords in the REFERER and REFERER_SOURCE_WORD fields.
(9) Purchase intent (the target variable)
Finally, the values are combined into one observation record (obs) and returned. Users making a first purchase and users who browsed without purchasing are then selected to form the training samples.
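The source does not say how a single day of data is split into sessions. A common convention, used here purely as an assumption, is to start a new session whenever the same visitor (vinfo) has been inactive for more than 30 minutes. The function name, the column names, and the threshold are illustrative.

```r
# Assign a session id to every log row: a new session starts with a new
# visitor or after more than `gap_secs` seconds without a request.
# Expects at least one row and a POSIXct `visit_time` column.
split_sessions <- function(log, gap_secs = 30 * 60) {
  log <- log[order(log$vinfo, log$visit_time), ]
  gap <- c(Inf, diff(as.numeric(log$visit_time)))
  new_user <- c(TRUE, log$vinfo[-1] != log$vinfo[-nrow(log)])
  log$session_id <- cumsum(new_user | gap > gap_secs)
  log
}

# Hypothetical single-day log for two visitors.
log <- data.frame(
  vinfo = c("a", "a", "a", "b"),
  visit_time = as.POSIXct(c("2015-12-08 09:00:00", "2015-12-08 09:10:00",
                            "2015-12-08 10:30:00", "2015-12-08 09:05:00"),
                          tz = "UTC"),
  stringsAsFactors = FALSE
)
sessions <- split_sessions(log)
print(sessions)
```

Each group of rows sharing a session_id can then be passed to the per-session feature calculations listed above, for example via split(sessions, sessions$session_id).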
Step 4: Update the potential customer information table in the database, update_info(channel,cal_number)
This is the main function of the program: it predicts potential customers, retrieves their information, and writes it to the database. It calls the get_data() and cal_feature() functions.
(1) Fetch data, compute features, and predict potential customers
data=get_data(channel,cal_number) # fetch the data for the date given by cal_number
feature=cal_feature(data) # compute the features for that day's data
feature[,14]=as.factor(feature[,14]) # convert the buy_flag field to a factor
feature0=feature[which(feature$buy_flag==0),] # records flagged 0 are the ones to be predicted
pre_rf=predict(model_rf,feature0[,3:14]) # predict with the model
potential_ip=feature0[which(pre_rf==1),1:2] # the IP and cookie of records predicted as 1 identify the potential customers
(2) Join with the user_info table on login_id and update the user's basic information
The basic information includes the user's age, gender, email address, birthday, and so on, so that the website can carry out offline promotion.
Step 5: Run the program to identify potential customers automatically
Running the whole program only requires calling update_info(channel,cal_number). Before the call, the system date is read and moved back one day; that date is passed as the cal_number argument.
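Deriving the cal_number argument from the system date takes one line of base R. The sketch assumes that cal_number is the date encoded as an integer in YYYYMMDD form; the source does not state the format, and the helper name is hypothetical.

```r
# Previous day's date as an integer in YYYYMMDD form (assumed format).
yesterday_cal_number <- function(today = Sys.Date()) {
  as.integer(format(today - 1, "%Y%m%d"))
}

cal_number <- yesterday_cal_number()
print(cal_number)
# The program would then be started with:
# update_info(channel, cal_number)
```

Passing the date as a parameter with a default makes the helper easy to check for fixed dates, including month and year boundaries.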
Step 6: Model performance verification
The final potential customer table is joined to the Xinyizhan log table on vinfo to obtain each potential customer's visits in the 30 days after the date on which the customer was identified. If, within those 30 days, the customer has a payment success marker (a request containing 'paysuccess') and the order ID extracted from the log table by regular expression has the status "paid" in the order information table, the potential customer is considered to have actually purchased, and the model's prediction is counted as correct. The follow-up purchase rate therefore serves as the model's performance metric.
It should be noted that appropriate modifications or substitutions made without departing from the principle of the invention shall also be regarded as falling within the scope of protection of the invention.
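The follow-up purchase rate can be computed as below. This is a sketch in base R rather than the Oracle stored procedure the text describes. It assumes a table of paid orders that has already been filtered by the two conditions above ('paysuccess' in the request and a "paid" status), and the table layouts and column names are hypothetical.

```r
# Share of identified potential customers with a paid order within
# `window_days` days after the date on which they were identified.
followup_purchase_rate <- function(candidates, paid_orders, window_days = 30) {
  bought <- vapply(seq_len(nrow(candidates)), function(i) {
    mine <- paid_orders$vinfo == candidates$vinfo[i]
    days <- as.numeric(paid_orders$pay_date[mine] - candidates$found_date[i])
    any(days > 0 & days <= window_days)
  }, logical(1))
  mean(bought)
}

# Hypothetical data: four customers identified on the same day.
candidates <- data.frame(
  vinfo = c("c1", "c2", "c3", "c4"),
  found_date = as.Date(rep("2015-12-08", 4)),
  stringsAsFactors = FALSE
)
paid_orders <- data.frame(
  vinfo = c("c1", "c3", "c4"),
  pay_date = as.Date(c("2015-12-20", "2016-02-01", "2015-12-09")),
  stringsAsFactors = FALSE
)
rate <- followup_purchase_rate(candidates, paid_orders)
print(rate)
```

Customers c1 and c4 pay within the window, c3 pays too late, and c2 never pays, so half of the identified customers count as correct predictions.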
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510903856.4A CN105488697A (en) | 2015-12-09 | 2015-12-09 | Potential customer mining method based on customer behavior characteristics |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510903856.4A CN105488697A (en) | 2015-12-09 | 2015-12-09 | Potential customer mining method based on customer behavior characteristics |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN105488697A true CN105488697A (en) | 2016-04-13 |
Family
ID=55675664
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201510903856.4A Pending CN105488697A (en) | 2015-12-09 | 2015-12-09 | Potential customer mining method based on customer behavior characteristics |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN105488697A (en) |
| CN109255638A (en) * | 2017-07-13 | 2019-01-22 | 北京融和友信科技股份有限公司 | A kind of mathematical model for excavating potential customers |
| CN109558396A (en) * | 2018-10-24 | 2019-04-02 | 深圳市万屏时代科技有限公司 | A kind of user demand data cleaning method and system |
| CN109977977A (en) * | 2017-12-28 | 2019-07-05 | 中移信息技术有限公司 | A kind of method and corresponding intrument identifying potential user |
| CN109983490A (en) * | 2016-10-06 | 2019-07-05 | 邓白氏公司 | The Machine learning classifiers and prediction engine that the potential customers of artificial intelligence optimization determine are carried out on winning/losing classification |
| CN110222272A (en) * | 2019-04-18 | 2019-09-10 | 广东工业大学 | A kind of potential customers excavate and recommended method |
| CN110232589A (en) * | 2019-05-16 | 2019-09-13 | 浙江华坤道威数据科技有限公司 | A kind of intention customer analysis system based on big data |
| CN110476159A (en) * | 2017-03-30 | 2019-11-19 | 日本电气株式会社 | Information processing system, characteristic value illustration method and characteristic value read-me |
| CN110637316A (en) * | 2016-12-22 | 2019-12-31 | 奥恩全球运营有限公司,新加坡分公司 | System and method for intelligent prospective object recognition using online resources and neural network processing to classify tissue based on published material |
| CN111091282A (en) * | 2019-12-10 | 2020-05-01 | 焦点科技股份有限公司 | Customer loyalty segmentation method based on user behavior data |
| CN111292194A (en) * | 2018-12-06 | 2020-06-16 | 泰康保险集团股份有限公司 | Online insurance client data processing method, device, medium and electronic equipment |
| CN111611514A (en) * | 2020-04-11 | 2020-09-01 | 上海淇玥信息技术有限公司 | Page display method and device based on user login information and electronic equipment |
| US10769159B2 (en) | 2016-12-22 | 2020-09-08 | Aon Global Operations Plc, Singapore Branch | Systems and methods for data mining of historic electronic communication exchanges to identify relationships, patterns, and correlations to deal outcomes |
| CN111681051A (en) * | 2020-06-08 | 2020-09-18 | 上海汽车集团股份有限公司 | Purchase intention prediction method, device, storage medium and terminal |
| CN111754253A (en) * | 2019-06-20 | 2020-10-09 | 北京沃东天骏信息技术有限公司 | User authentication method, device, computer equipment and storage medium |
| CN111914187A (en) * | 2020-07-23 | 2020-11-10 | 向杰 | Method for recommending commodities and tracking recommending relation chain |
| CN112070519A (en) * | 2019-06-11 | 2020-12-11 | 中国科学院沈阳自动化研究所 | A prediction method based on data global search and feature classification |
| US10951695B2 (en) | 2019-02-14 | 2021-03-16 | Aon Global Operations Se Singapore Branch | System and methods for identification of peer entities |
| CN112598007A (en) * | 2021-03-04 | 2021-04-02 | 浙江所托瑞安科技集团有限公司 | Method, device and equipment for screening picture training set and readable storage medium |
| CN112667911A (en) * | 2021-01-14 | 2021-04-16 | 中山世达模型制造有限公司 | Method for searching potential customers by using social software big data |
| TWI735932B (en) * | 2019-08-21 | 2021-08-11 | 崑山科技大學 | Real estate potential customer forecasting system and method thereof |
| CN113538025A (en) * | 2020-04-14 | 2021-10-22 | 中国移动通信集团浙江有限公司 | Replacement prediction method and device for terminal equipment |
| CN113554460A (en) * | 2021-07-19 | 2021-10-26 | 北京沃东天骏信息技术有限公司 | Method and device for identifying potential user |
| CN113934616A (en) * | 2021-12-16 | 2022-01-14 | 深圳市活力天汇科技股份有限公司 | Method for judging abnormal user based on user operation time sequence |
| CN114519608A (en) * | 2022-02-15 | 2022-05-20 | 平安证券股份有限公司 | Business opportunity extraction method, device, medium and electronic equipment |
| CN116418881A (en) * | 2023-04-18 | 2023-07-11 | 吉林省禹语网络科技有限公司 | Data intelligent processing method for E-commerce big data cloud edge cooperative transmission |
| CN116523600A (en) * | 2023-05-05 | 2023-08-01 | 佛山市大迈信息科技有限公司 | Customer classification method and system based on behavior analysis |
| CN116523572A (en) * | 2023-06-28 | 2023-08-01 | 悦享星光(北京)科技有限公司 | Client mining method and system based on client behavior characteristics |
| US12361389B2 (en) | 2022-10-26 | 2025-07-15 | Volvo Car Corporation | Vehicle sharing service optimization |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6839680B1 (en) * | 1999-09-30 | 2005-01-04 | Fujitsu Limited | Internet profiling |
| CN102542335A (en) * | 2011-06-16 | 2012-07-04 | 广州市龙泰信息技术有限公司 | Mixed data mining method |
| CN104142960A (en) * | 2013-05-10 | 2014-11-12 | 上海普华诚信信息技术有限公司 | Internet data analysis system |
| CN105069654A (en) * | 2015-08-07 | 2015-11-18 | 新一站保险代理有限公司 | User identification based website real-time/non-real-time marketing investment method and system |
- 2015-12-09: CN application CN201510903856.4A (CN105488697A), status: active, Pending
Cited By (75)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106294587A (en) * | 2016-07-28 | 2017-01-04 | 焦点科技股份有限公司 | Special topic module drainage effect methods of exhibiting in the website of a kind of Rapid Implementation |
| CN106294587B (en) * | 2016-07-28 | 2019-05-10 | 焦点科技股份有限公司 | Thematic module drainage effect methods of exhibiting in a kind of website of Rapid Implementation |
| CN106250481A (en) * | 2016-07-29 | 2016-12-21 | 深圳市永兴元科技有限公司 | Data digging methods based on big data and device |
| CN106294778B (en) * | 2016-08-11 | 2019-09-10 | 北京小米移动软件有限公司 | Information-pushing method and device |
| CN106294778A (en) * | 2016-08-11 | 2017-01-04 | 北京小米移动软件有限公司 | Information-pushing method and device |
| CN106294823A (en) * | 2016-08-17 | 2017-01-04 | 上海云信留客信息科技有限公司 | Abnormality detection and the method for elimination for big data cleansing |
| CN106294823B (en) * | 2016-08-17 | 2019-03-22 | 上海云信留客信息科技有限公司 | The method of abnormality detection and elimination for big data cleaning |
| CN106375431B (en) * | 2016-08-31 | 2019-12-31 | 北京城市网邻信息技术有限公司 | Business opportunity recommendation method and device |
| CN106375431A (en) * | 2016-08-31 | 2017-02-01 | 北京城市网邻信息技术有限公司 | Business opportunity recommendation method and device |
| CN107886382B (en) * | 2016-09-29 | 2021-11-30 | 北京京东尚科信息技术有限公司 | Method, device and system for analyzing channel drainage effect in website |
| CN107886382A (en) * | 2016-09-29 | 2018-04-06 | 北京京东尚科信息技术有限公司 | The method, apparatus and system of channel drainage effect in analyzing web site station |
| CN109983490A (en) * | 2016-10-06 | 2019-07-05 | 邓白氏公司 | The Machine learning classifiers and prediction engine that the potential customers of artificial intelligence optimization determine are carried out on winning/losing classification |
| CN109983490B (en) * | 2016-10-06 | 2023-08-29 | 邓白氏公司 | Machine learning classifier and prediction engine for potential customer determination of artificial intelligence optimization on winning/losing classifications |
| CN110637316A (en) * | 2016-12-22 | 2019-12-31 | 奥恩全球运营有限公司,新加坡分公司 | System and method for intelligent prospective object recognition using online resources and neural network processing to classify tissue based on published material |
| US11455313B2 (en) | 2016-12-22 | 2022-09-27 | Aon Global Operations Se, Singapore Branch | Systems and methods for intelligent prospect identification using online resources and neural network processing to classify organizations based on published materials |
| CN110637316B (en) * | 2016-12-22 | 2021-04-13 | 奥恩全球运营有限公司,新加坡分公司 | System and method for prospective object identification |
| US10769159B2 (en) | 2016-12-22 | 2020-09-08 | Aon Global Operations Plc, Singapore Branch | Systems and methods for data mining of historic electronic communication exchanges to identify relationships, patterns, and correlations to deal outcomes |
| CN106713034A (en) * | 2016-12-23 | 2017-05-24 | 广州帷策智能科技有限公司 | Wechat public account making user group activation monitoring method and apparatus |
| US11727203B2 (en) | 2017-03-30 | 2023-08-15 | Dotdata, Inc. | Information processing system, feature description method and feature description program |
| CN110476159A (en) * | 2017-03-30 | 2019-11-19 | 日本电气株式会社 | Information processing system, characteristic value illustration method and characteristic value read-me |
| CN106991175A (en) * | 2017-04-06 | 2017-07-28 | 百度在线网络技术(北京)有限公司 | A kind of customer information method for digging, device, equipment and storage medium |
| CN106991175B (en) * | 2017-04-06 | 2020-08-11 | 百度在线网络技术(北京)有限公司 | Customer information mining method, device, equipment and storage medium |
| WO2018191918A1 (en) * | 2017-04-20 | 2018-10-25 | Beijing Didi Infinity Technology And Development Co., Ltd. | System and method for learning-based group tagging |
| CN109690571A (en) * | 2017-04-20 | 2019-04-26 | 北京嘀嘀无限科技发展有限公司 | Learning-based group tagging system and method |
| CN109690571B (en) * | 2017-04-20 | 2020-09-18 | 北京嘀嘀无限科技发展有限公司 | Learning-based group labeling system and method |
| CN108932625A (en) * | 2017-05-23 | 2018-12-04 | 北京京东尚科信息技术有限公司 | Analysis method, device, medium and the electronic equipment of user behavior data |
| CN108932625B (en) * | 2017-05-23 | 2022-04-26 | 北京京东尚科信息技术有限公司 | User behavior data analysis method, device, medium and electronic equipment |
| CN109255638A (en) * | 2017-07-13 | 2019-01-22 | 北京融和友信科技股份有限公司 | A kind of mathematical model for excavating potential customers |
| CN109255638B (en) * | 2017-07-13 | 2022-04-26 | 北京融和友信科技股份有限公司 | Mathematical model for mining potential customers |
| CN107516236A (en) * | 2017-07-22 | 2017-12-26 | 长沙兔子代跑网络科技有限公司 | A kind of method and device that generation race client is excavated according to user behavior data |
| CN107526778A (en) * | 2017-07-22 | 2017-12-29 | 长沙兔子代跑网络科技有限公司 | A kind of method and device that generation race client is excavated according to user behavior data |
| WO2019037202A1 (en) * | 2017-08-24 | 2019-02-28 | 平安科技(深圳)有限公司 | Method and apparatus for recognising target customer, electronic device and medium |
| CN107590688A (en) * | 2017-08-24 | 2018-01-16 | 平安科技(深圳)有限公司 | The recognition methods of target customer and terminal device |
| CN107679889A (en) * | 2017-09-08 | 2018-02-09 | 平安科技(深圳)有限公司 | The recognition methods of potential customers a kind of and terminal device |
| CN107944913A (en) * | 2017-11-21 | 2018-04-20 | 重庆邮电大学 | High potential user's purchase intention Forecasting Methodology based on big data user behavior analysis |
| CN107862556A (en) * | 2017-12-04 | 2018-03-30 | 北京奇艺世纪科技有限公司 | A kind of put-on method and system of VIP advertisements |
| CN108053263A (en) * | 2017-12-28 | 2018-05-18 | 北京金堤科技有限公司 | The method and device of potential user's data mining |
| CN109977977A (en) * | 2017-12-28 | 2019-07-05 | 中移信息技术有限公司 | A kind of method and corresponding intrument identifying potential user |
| CN108256907A (en) * | 2018-01-09 | 2018-07-06 | 北京腾云天下科技有限公司 | A kind of construction method and computing device of customer grouping model |
| CN108683949A (en) * | 2018-05-18 | 2018-10-19 | 北京奇艺世纪科技有限公司 | A kind of extracting method and device of live streaming platform potential user |
| CN109191169A (en) * | 2018-07-19 | 2019-01-11 | 国政通科技有限公司 | Precisely hit the method for high-end tourism potential user |
| CN109558396A (en) * | 2018-10-24 | 2019-04-02 | 深圳市万屏时代科技有限公司 | A kind of user demand data cleaning method and system |
| CN111292194A (en) * | 2018-12-06 | 2020-06-16 | 泰康保险集团股份有限公司 | Online insurance client data processing method, device, medium and electronic equipment |
| CN111292194B (en) * | 2018-12-06 | 2023-08-22 | 泰康保险集团股份有限公司 | Online application client data processing method and device, medium and electronic equipment |
| US10951695B2 (en) | 2019-02-14 | 2021-03-16 | Aon Global Operations Se Singapore Branch | System and methods for identification of peer entities |
| CN110222272A (en) * | 2019-04-18 | 2019-09-10 | 广东工业大学 | A kind of potential customers excavate and recommended method |
| CN110232589A (en) * | 2019-05-16 | 2019-09-13 | 浙江华坤道威数据科技有限公司 | A kind of intention customer analysis system based on big data |
| CN112070519B (en) * | 2019-06-11 | 2024-03-05 | 中国科学院沈阳自动化研究所 | Prediction method based on data global search and feature classification |
| CN112070519A (en) * | 2019-06-11 | 2020-12-11 | 中国科学院沈阳自动化研究所 | A prediction method based on data global search and feature classification |
| CN111754253A (en) * | 2019-06-20 | 2020-10-09 | 北京沃东天骏信息技术有限公司 | User authentication method, device, computer equipment and storage medium |
| TWI735932B (en) * | 2019-08-21 | 2021-08-11 | 崑山科技大學 | Real estate potential customer forecasting system and method thereof |
| CN111091282A (en) * | 2019-12-10 | 2020-05-01 | 焦点科技股份有限公司 | Customer loyalty segmentation method based on user behavior data |
| CN111091282B (en) * | 2019-12-10 | 2022-07-22 | 焦点科技股份有限公司 | A Customer Loyalty Segmentation Method Based on User Behavior Data |
| CN111611514B (en) * | 2020-04-11 | 2024-04-23 | 上海淇玥信息技术有限公司 | Page display method and device based on user login information and electronic equipment |
| CN111611514A (en) * | 2020-04-11 | 2020-09-01 | 上海淇玥信息技术有限公司 | Page display method and device based on user login information and electronic equipment |
| CN113538025B (en) * | 2020-04-14 | 2024-03-22 | 中国移动通信集团浙江有限公司 | Replacement prediction method and device for terminal equipment |
| CN113538025A (en) * | 2020-04-14 | 2021-10-22 | 中国移动通信集团浙江有限公司 | Replacement prediction method and device for terminal equipment |
| CN111681051A (en) * | 2020-06-08 | 2020-09-18 | 上海汽车集团股份有限公司 | Purchase intention prediction method, device, storage medium and terminal |
| CN111681051B (en) * | 2020-06-08 | 2023-09-26 | 上海汽车集团股份有限公司 | Purchase intention prediction method, device, storage medium and terminal |
| CN111914187B (en) * | 2020-07-23 | 2023-09-08 | 向杰 | Commodity recommendation and recommendation relation chain tracking method |
| CN111914187A (en) * | 2020-07-23 | 2020-11-10 | 向杰 | Method for recommending commodities and tracking recommending relation chain |
| CN112667911A (en) * | 2021-01-14 | 2021-04-16 | 中山世达模型制造有限公司 | Method for searching potential customers by using social software big data |
| CN112598007B (en) * | 2021-03-04 | 2021-05-18 | 浙江所托瑞安科技集团有限公司 | Method, device and equipment for screening picture training set and readable storage medium |
| CN112598007A (en) * | 2021-03-04 | 2021-04-02 | 浙江所托瑞安科技集团有限公司 | Method, device and equipment for screening picture training set and readable storage medium |
| CN113554460A (en) * | 2021-07-19 | 2021-10-26 | 北京沃东天骏信息技术有限公司 | Method and device for identifying potential user |
| CN113554460B (en) * | 2021-07-19 | 2024-10-22 | 北京沃东天骏信息技术有限公司 | Potential user identification method and device |
| CN113934616B (en) * | 2021-12-16 | 2022-03-18 | 深圳市活力天汇科技股份有限公司 | Method for judging abnormal user based on user operation time sequence |
| CN113934616A (en) * | 2021-12-16 | 2022-01-14 | 深圳市活力天汇科技股份有限公司 | Method for judging abnormal user based on user operation time sequence |
| CN114519608A (en) * | 2022-02-15 | 2022-05-20 | 平安证券股份有限公司 | Business opportunity extraction method, device, medium and electronic equipment |
| US12361389B2 (en) | 2022-10-26 | 2025-07-15 | Volvo Car Corporation | Vehicle sharing service optimization |
| CN116418881A (en) * | 2023-04-18 | 2023-07-11 | 吉林省禹语网络科技有限公司 | Data intelligent processing method for E-commerce big data cloud edge cooperative transmission |
| CN116418881B (en) * | 2023-04-18 | 2024-06-04 | 湖南供销电子商务股份有限公司 | Data intelligent processing method for E-commerce big data cloud edge cooperative transmission |
| CN116523600A (en) * | 2023-05-05 | 2023-08-01 | 佛山市大迈信息科技有限公司 | Customer classification method and system based on behavior analysis |
| CN116523572A (en) * | 2023-06-28 | 2023-08-01 | 悦享星光(北京)科技有限公司 | Client mining method and system based on client behavior characteristics |
| CN116523572B (en) * | 2023-06-28 | 2023-09-08 | 悦享星光(北京)科技有限公司 | Client mining method and system based on client behavior characteristics |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN105488697A (en) | Potential customer mining method based on customer behavior characteristics | |
| US9015185B2 (en) | Ontology based recommendation systems and methods | |
| KR100786795B1 (en) | Internet advertising service system and method | |
| EP2704080A1 (en) | Recommendation systems and methods | |
| US20140289239A1 (en) | Recommendation tuning using interest correlation | |
| CN115544242B (en) | Big data-based similar commodity model selection recommendation method | |
| US10679227B2 (en) | Systems and methods for mapping online data to data of interest | |
| Alazab et al. | Maximising competitive advantage on E-business websites: A data mining approach | |
| CN113901308A (en) | Enterprise recommendation method, recommendation device and electronic equipment based on knowledge graph | |
| KR20080026948A (en) | How to extract related keyword groups | |
| CN106530017A (en) | Online store discount coupon automatic acquisition and shopping combination recommendation method | |
| CN119130603A (en) | An interest recommendation algorithm combining user behavior data | |
| JP2007264718A (en) | User interest analysis device, method, program | |
| CN105022830A (en) | Weighting trajectory data set construction method based on user behaviors | |
| Moazzam et al. | Customer opinion mining by comments classification using machine learning | |
| Zhao et al. | Anatomy of a web-scale resale market: a data mining approach | |
| CN120106939A (en) | Commodity search method and its device, equipment and medium | |
| Fitrianah et al. | Analysis of consumer purchase patterns on handphone accessories sales using fp-growth algorithm | |
| KR20100046421A (en) | Method and server for estimating preference of commodity | |
| Granov | Customer loyalty, return and churn prediction through machine learning methods: for a Swedish fashion and e-commerce company | |
| Agrawal et al. | Pros and cons of web mining in E-Commerce | |
| Zhao | The review of web mining in e-commerce | |
| Li et al. | Incorporating both positive and negative association rules into the analysis of outbound tourism in Hong Kong | |
| JP7716202B2 (en) | Information processing device, information processing method, and program | |
| CN112069388B (en) | Entity recommendation method, system, computer device and computer readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20160413 |
| | RJ01 | Rejection of invention patent application after publication | |