
CN108021658B - Intelligent big data searching method and system based on whale optimization algorithm - Google Patents

Intelligent big data searching method and system based on whale optimization algorithm

Info

Publication number
CN108021658B
Authority
CN
China
Prior art keywords
whale
big data
group
user
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711252320.6A
Other languages
Chinese (zh)
Other versions
CN108021658A (en
Inventor
叶志伟
杨娟
王春枝
王若曦
胡志勇
金灿
徐萍
谭敏
郑逍
孙一恒
侯亚君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University of Technology
Wuhan Fiberhome Technical Services Co Ltd
Original Assignee
Hubei University of Technology
Wuhan Fiberhome Technical Services Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University of Technology and Wuhan Fiberhome Technical Services Co Ltd
Priority to CN201711252320.6A
Publication of CN108021658A
Application granted
Publication of CN108021658B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a big data intelligent search method and system based on the whale optimization algorithm. The whale optimization algorithm is used to solve the big data intelligent search problem as an optimization problem, so that the search results closest to the user's needs are obtained quickly and the accuracy and efficiency of the intelligent search engine are further improved. The invention can find a high-quality feasible solution to the big data intelligent search problem within an acceptable time cost: according to the user's search conditions, big data related to the keywords entered by the user is obtained from the engine database, the user repeatedly scores the search results, and results that meet the user's individual needs are obtained step by step. By using the whale optimization algorithm for big data intelligent search, the invention builds an intelligent search engine based on ontology features that quickly and efficiently finds the search results that best meet the user's needs. This solves the problem that current search engines cannot adequately provide users with the information they are interested in, while keeping search efficiency high.

Figure 201711252320

Description

A Big Data Intelligent Search Method and System Based on the Whale Optimization Algorithm

Technical Field

The invention belongs to the cross-application field of big data and intelligent computing. It relates to the application of an intelligent optimization algorithm in the field of big data, in particular to a solution to the big data intelligent search problem, and specifically to a big data intelligent search method and system based on the whale optimization algorithm.

Background

In recent years, big data has developed rapidly around the world and has attracted close attention from academia, industry, and governments. Managing big data effectively and extracting its value through analysis makes it possible to provide high value-added applications and services and to realize greater economic and social value. However, the arrival of the big data era brings technical challenges as well as major development opportunities. Traditional computing techniques face many technical difficulties when handling big data problems. New and effective technical methods therefore need to be researched and found in order to carry out the analysis, processing, and value discovery of big data.

Unlike traditional database queries, a characteristic of web search is that users often cannot fully express their needs at once. Instead, they reach their goal only approximately, through repeated interaction with the search engine and gradual refinement. This makes searching inefficient. If the search system could infer the meaning of a query through big data analysis the first time the user submits it, and automatically extend the query expression semantically, hitting the target in a single attempt would greatly improve efficiency and reduce the burden on the user.

The web search engine is a powerful tool for online information lookup and a key technology of web information retrieval. Traditional search engines mainly include directory search engines, full-text search engines, meta search engines, aggregate search engines, and vertical search engines. With the diversification of information formats and the surge in the volume of information, traditional search engines face enormous challenges and can no longer meet users' needs for more personalized, intelligent, and diverse search. From information collection, through information organization and indexing, to information retrieval and the user interface, intelligent search engines are continually improving every aspect of traditional search engines. Intelligent search engines mainly include ontology-based, knowledge-base-system-based, and semantic-association-based intelligent search engines. An intelligent search engine raises the traditional search engine from keyword-level retrieval to retrieval at the level of knowledge or concepts. It can understand words from the perspective of knowledge and concepts, shows strong intelligence and personalization, and, building on knowledge base technology, has strong natural language understanding and knowledge processing capabilities.

After more than 20 years of development, big data search engines have improved considerably in search speed and accuracy, but their basic architecture and technology have not changed much, which limits them. Search engines for different application scenarios need very different information. Non-intelligent retrieval lacks the ability to identify the information a user is interested in, and its ranking cannot be adjusted for different users. Providing targeted retrieval services for different people is one of the directions for the future development of search engines, and various intelligent search technologies have emerged as a result. Disciplines such as artificial intelligence and computational intelligence arose from researchers' attempts to draw inspiration from human thinking and from regularities in the biological world, build corresponding computational models, and apply them to information science. Artificial neural networks, swarm intelligence, and genetic computing are successful examples that have been applied to current big data challenges. Building an intelligent search engine that incorporates ontology features both solves the problem that current search engines cannot adequately provide users with the information they are interested in and improves search efficiency.

Summary of the Invention

To solve the big data intelligent search problem, the invention proposes a big data intelligent search method and system based on the whale optimization algorithm.

The technical solution adopted by the method of the invention is a big data intelligent search method based on the whale optimization algorithm, characterized by comprising the following steps:

Step 1: Read in the user's search conditions and, according to them, obtain from the engine database the big data items that match the keywords entered by the user. Each big data item is one whale. Let $X_i$ denote the current position of the $i$-th whale in the whale group, and initialize the positions of the whale group:

$$X_i = (x_i^1, x_i^2, \ldots, x_i^n), \quad i = 1, 2, \ldots, N$$

where $n$ is the dimension and $N$ is the whale group size;

Step 2: Initialize the parameters required by the whale optimization algorithm, including the whale group size $N$, the logarithmic spiral shape constant $b$, the current iteration number $j$, the maximum number of iterations $M$, and the global optimal position $G$ of the whole whale group;

Step 3: Calculate the fitness function values of the initial positions of the whale group, and take the big data item with the highest fitness function value as the best individual position $X^*$ of the current whale group;

Step 4: Calculate the coefficient vectors $A$ and $C$;

Step 5: Generate a random number $p$ in the range $[0,1]$, and select the way of updating the whale group positions according to the value of $p$;

Step 6: Decode the updated position vectors of the whale group into the corresponding big data items and present them to the user. The user scores the presented items according to their own search conditions, and the scores serve as fitness function values. Find and save the best individual $X^*$ in the current group;

Step 7: Determine the positions of the next-generation whale group by comparing the fitness function values of the position vectors before and after the update;

Step 8: Record the whale position corresponding to the best-matching big data item as the global optimal solution $G$, together with its fitness function value;

Step 9: Determine whether the user has found the required text document in the engine;

If not, set $j = j + 1$ and return to step 4;

If so, output the fitness value of the best individual and the big data item corresponding to its position $X^*$.

The technical solution adopted by the system of the invention is a system characterized by comprising an input module, a whale optimization algorithm initialization module, a fitness function value module, a coefficient vector calculation module, a whale position update mode selection module, an updated whale position vector fitness value calculation module, a next-generation whale position determination module, a module for recording the global optimal solution $G$ and its fitness function value, and a judgment module;

The input module reads in the user's search conditions and, according to them, obtains from the engine database the big data items that match the keywords entered by the user. Each big data item is one whale. With $X_i$ denoting the current position of the $i$-th whale in the whale group, the module initializes the positions of the whale group:

$$X_i = (x_i^1, x_i^2, \ldots, x_i^n), \quad i = 1, 2, \ldots, N$$

where $n$ is the dimension and $N$ is the whale group size;

The whale optimization algorithm initialization module initializes the parameters required by the whale optimization algorithm, including the whale group size $N$, the logarithmic spiral shape constant $b$, the current iteration number $j$, the maximum number of iterations $M$, and the global optimal position $G$ of the whole whale group;

The fitness function value module calculates the fitness function values of the initial positions of the whale group, and takes the big data item with the highest fitness function value as the best individual position $X^*$ of the current whale group;

The coefficient vector calculation module calculates the coefficient vectors $A$ and $C$;

The whale position update mode selection module generates a random number $p$ in the range $[0,1]$ and selects the way of updating the whale group positions according to the value of $p$;

The updated whale position vector fitness value calculation module decodes the updated position vectors of the whale group into the corresponding big data items and presents them to the user. The user scores the presented items according to their own search conditions, and the scores serve as fitness function values. The module finds and saves the best individual $X^*$ in the current group;

The next-generation whale position determination module determines the positions of the next-generation whale group by comparing the fitness function values of the position vectors before and after the update;

The recording module records the whale position corresponding to the best-matching big data item as the global optimal solution $G$, together with its fitness function value;

The judgment module determines whether the user has found the required text document in the engine;

If not, set $j = j + 1$ and return to step 4;

If so, output the fitness value of the best individual and the big data item corresponding to its position $X^*$.

The beneficial effects of the invention are as follows. The whale optimization algorithm has few tuning parameters, fast convergence, high optimization accuracy, strong global search ability, and good convergence stability. Using it to solve the big data intelligent search problem quickly yields the search results closest to the user's needs, with good accuracy and efficiency.

Brief Description of the Drawings

Fig. 1 is a flow chart of the method according to an embodiment of the invention.

Detailed Description

To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawing and an embodiment. The embodiment described here serves only to illustrate and explain the invention and is not intended to limit it.

Referring to Fig. 1, the big data intelligent search method based on the whale optimization algorithm provided by the invention comprises the following steps:

Step 1: Read in the user's search conditions and, according to them, obtain from the engine database the big data items that match the keywords entered by the user. Each big data item is one whale. Let $X_i$ denote the current position of the $i$-th whale in the whale group, and initialize the positions of the whale group:

$$X_i = (x_i^1, x_i^2, \ldots, x_i^n), \quad i = 1, 2, \ldots, N$$

where $n$ is the dimension and $N$ is the whale group size;

Step 2: Set the parameters required by the whale optimization algorithm: the whale group size $N$, the logarithmic spiral shape constant $b$, the current iteration number $j$, the maximum number of iterations $M$, and the global optimal position $G$ of the whole whale group.

Step 3: Decode the initial position vectors of the whale group into the corresponding big data items and present them to the user. The user scores the presented items according to their own search conditions, and each score is the fitness function value of that item. The engine takes the big data item with the highest fitness function value as the best individual position $X^*$ of the current whale group.

Step 4: Calculate the coefficient vectors $A$ and $C$.

The coefficient vector $A$ is calculated as:

$$A = 2a \times r - a$$

$$a = 2 - \frac{2j}{M}$$

where $j$ is the current iteration number, $M$ is the maximum number of iterations, and $r$ is a random vector with components in $[0,1]$.

The coefficient vector $C$ is calculated as:

$$C = 2r$$

where $r$ is a random vector with components in $[0,1]$.
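The coefficient calculation in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `coefficient_vectors` and its parameters are hypothetical, and the control parameter `a` is assumed to decrease linearly from 2 to 0 over the run, which is the usual choice in the whale optimization algorithm but is given in the source only as an image.

```python
import random


def coefficient_vectors(j, M, n, rng=random):
    """Return (A, C) as lists of length n for iteration j out of at most M."""
    # Assumption: a falls linearly from 2 (j = 0) to 0 (j = M).
    a = 2.0 - 2.0 * j / M
    # A = 2a * r - a, with a fresh r in [0, 1) for each component.
    A = [2.0 * a * rng.random() - a for _ in range(n)]
    # C = 2r, again with a fresh r for each component.
    C = [2.0 * rng.random() for _ in range(n)]
    return A, C
```

Each component of `A` lies between `-a` and `a`, so the range of `A` shrinks as the iterations proceed, while each component of `C` stays between 0 and 2.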

Step 5: Generate a random number $p$ in the range $[0,1]$, and select the way of updating the whale group positions according to the value of $p$.

When $p < 0.5$ and $A < 1$, the position of the current whale is updated by:

$$X_{j+1} = X_j - A \times D$$

$$D = |C \times X_j^* - X_j|$$

where $j$ is the current iteration number, $X_j$ is the current position of the individual whale, $A$ and $C$ are the coefficient vectors, and $X_j^*$ is the best individual position in the current whale group.

When $p < 0.5$ and $A \geq 1$, a whale position $X_{rand}$ is selected at random from the current group, and the position of the current whale is updated by:

$$X = X_{rand} - A \times D$$

$$D = |C \times X_{rand,j} - X_j|$$

where $X_{rand}$ is a randomly selected position in the current whale group, i.e. a random individual, and $X_{rand,j}$ is the randomly selected position in the $j$-th dimension of the current whale group.

When $p \geq 0.5$, the position of the current whale is updated by:

$$X_{j+1} = D' \times e^{bl} \times \cos(2\pi l) + X_j^*$$

$$D' = |X_j^* - X_j|$$

where $D'$ is the distance between the current best position of the $i$-th whale in the group and the prey, $b$ is the logarithmic spiral shape constant, $l$ is a random number in $[-1,1]$, $X_j$ is the current position of the individual whale, and $X_j^*$ is the best individual position in the current whale group.
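The three update rules of step 5 can be sketched as one Python function. Several things here are assumptions rather than part of the source: the name `update_position`, the use of scalar `A` and `C` per whale, the default `b=1.0`, and the exact form of the spiral formula, which the source gives only as images and which is written here in the usual whale optimization form. The first two branches follow the text literally, including the test `A < 1` (common formulations of the algorithm test the magnitude of `A` instead).

```python
import math
import random


def update_position(x, best, population, A, C, b=1.0, rng=random):
    """Return the updated position of one whale; positions are lists of floats."""
    p = rng.random()
    if p < 0.5:
        if A < 1:
            # Encircling: D = |C * X_best - X|, X_new = X - A * D (as in the text).
            return [xi - A * abs(C * bi - xi) for xi, bi in zip(x, best)]
        # Exploration: D = |C * X_rand - X|, X_new = X_rand - A * D.
        x_rand = rng.choice(population)
        return [ri - A * abs(C * ri - xi) for xi, ri in zip(x, x_rand)]
    # Spiral (assumed form): D' = |X_best - X|,
    # X_new = D' * e^(b*l) * cos(2*pi*l) + X_best, with l in [-1, 1].
    l = rng.uniform(-1.0, 1.0)
    spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
    return [abs(bi - xi) * spiral + bi for xi, bi in zip(x, best)]
```

A whale that already sits on the best position, with `C = 1`, stays where it is under the encircling and spiral rules, which is a quick sanity check for an implementation.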

Step 6: Decode the updated position vectors of the whale group into the corresponding big data items and present them to the user. The user scores the presented items according to their own search conditions, and the scores serve as fitness function values. Find and save the best individual $X^*$ in the current group.

Step 7: Determine the positions of the next-generation whale group by comparing the fitness function values of the position vectors before and after the update. The rule is: if the fitness function value of the updated position vector is higher than that before the update, the updated position replaces the original one; otherwise, the position from before the update is kept. The fitness function value is calculated in the same way as in step 3.
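The replacement rule of step 7 amounts to a greedy comparison. The sketch below assumes the comparison is made whale by whale, which the text does not state explicitly, and the function name `select_next_generation` is hypothetical.

```python
def select_next_generation(old_positions, old_scores, new_positions, new_scores):
    """Keep, for each whale, whichever of its old and new positions scored higher."""
    next_positions, next_scores = [], []
    for old_x, old_s, new_x, new_s in zip(
        old_positions, old_scores, new_positions, new_scores
    ):
        if new_s > old_s:
            # The updated position scored strictly higher, so it replaces the old one.
            next_positions.append(new_x)
            next_scores.append(new_s)
        else:
            # Ties and lower scores keep the position from before the update.
            next_positions.append(old_x)
            next_scores.append(old_s)
    return next_positions, next_scores
```

Because ties keep the old position, a whale only moves when the user rated the new result strictly higher.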

Step 8: Record the whale position corresponding to the best-matching big data item as the global optimal solution $G$, together with its fitness function value.

Step 9: Determine whether the user has found the required text document in the engine;

If not, set $j = j + 1$ and return to step 4;

If so, output the fitness value of the best individual and the big data item corresponding to its position $X^*$.
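The interaction loop formed by steps 3 and 6 to 9 can be sketched as follows. This is a rough outline, not the patented implementation: the user's scoring is modelled by a hypothetical callback `score`, the user's decision in step 9 by a hypothetical callback `is_satisfied`, and steps 4 and 5 are delegated to a hypothetical callback `propose`. The `max_iter` guard is also an assumption; the text lists a maximum iteration number M among the parameters but states only the user's decision as the stopping test.

```python
from typing import Callable, List, Tuple

Position = List[float]


def interactive_search(
    population: List[Position],
    score: Callable[[Position], float],
    propose: Callable[[Position, Position, List[Position], int], Position],
    is_satisfied: Callable[[Position, float], bool],
    max_iter: int,
) -> Tuple[Position, float]:
    """Return the best position found and its score."""
    population = [list(x) for x in population]
    scores = [score(x) for x in population]  # step 3: initial scoring
    best = max(range(len(population)), key=lambda i: scores[i])
    G, G_score = list(population[best]), scores[best]
    j = 0
    # Step 9: stop once the user is satisfied (max_iter is an added safeguard).
    while j < max_iter and not is_satisfied(G, G_score):
        # Steps 4-5: build one candidate position per whale.
        candidates = [propose(x, G, population, j) for x in population]
        # Step 6: the user scores the decoded candidates.
        cand_scores = [score(x) for x in candidates]
        # Step 7: keep a candidate only if it scored strictly higher.
        for i, (x_new, s_new) in enumerate(zip(candidates, cand_scores)):
            if s_new > scores[i]:
                population[i], scores[i] = list(x_new), s_new
        # Step 8: record the global best G and its score.
        best = max(range(len(population)), key=lambda i: scores[i])
        if scores[best] > G_score:
            G, G_score = list(population[best]), scores[best]
        j += 1
    return G, G_score
```

Passing the scoring and stopping decisions in as callbacks keeps the loop testable without a human in it; in the described system both would come from the user's interaction with the search engine.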

The invention uses the whale optimization algorithm to solve the big data intelligent search problem as an optimization problem and thereby quickly obtains the search results closest to the user's needs. The method can be used in technical fields related to big data and intelligent computing.

It should be understood that the parts not described in detail in this specification belong to the prior art.

It should be understood that the above description of the preferred embodiment is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Guided by the invention, and without departing from the scope protected by its claims, those of ordinary skill in the art may make substitutions or modifications, all of which fall within the protection scope of the invention. The scope of protection sought is defined by the appended claims.

Claims (7)

1. A big data intelligent search method based on the whale optimization algorithm, characterized by comprising the following steps:
step 1: reading in the user's search conditions and, according to them, obtaining from an engine database the big data items matched with the keywords input by the user, wherein each big data item is one whale and $X_i$ is the current position of the $i$-th whale in the whale group, and initializing the positions of the whale group:
$$X_i = (x_i^1, x_i^2, \ldots, x_i^n), \quad i = 1, 2, \ldots, N$$
where $n$ represents the dimension and $N$ represents the whale group size;
step 2: initializing the parameters required by the whale optimization algorithm, comprising the whale group size $N$, the logarithmic spiral shape constant $b$, the current iteration number $j$, the maximum number of iterations $M$, and the global optimal position $G$ of the whole whale group;
step 3: calculating the fitness function values of the initial positions of the whale group, and taking the big data item with the highest fitness function value as the best individual position $X^*$ of the current whale group;
step 4: calculating coefficient vectors $A$ and $C$;
step 5: generating a random number $p$ with a value range of $[0,1]$, and selecting the way of updating the whale group positions according to the value of $p$;
step 6: decoding the updated position vectors of the whale group into the corresponding big data items and presenting them to the user, wherein the user scores the presented items according to the user's own search conditions and the scores are used as fitness function values; and finding and saving the best individual $X^*$ in the current group;
step 7: determining the positions of the next-generation whale group by comparing the fitness function values corresponding to the position vectors of the whale group before and after the update;
step 8: recording the whale position corresponding to the best-matching big data item as the global optimal solution $G$, together with its fitness function value;
step 9: judging whether the user has found the required text document in the engine;
if not, setting $j = j + 1$ and returning to step 4;
if yes, outputting the fitness value of the best individual and the big data item corresponding to its position $X^*$.
2. The big data intelligent search method based on the whale optimization algorithm according to claim 1, wherein in step 3 the fitness function value is calculated by decoding the position vectors of the whale group into the corresponding big data items and presenting them to the user, and the user scores the presented items according to the user's own search conditions.
3. The big data intelligent search method based on the whale optimization algorithm according to claim 1, wherein the coefficient vector $A$ in step 4 is calculated as:
$$A = 2a \times r - a$$
$$a = 2 - \frac{2j}{M}$$
wherein $r$ is a random vector with a value range of $[0,1]$.
4. The big data intelligent search method based on the whale optimization algorithm according to claim 1, wherein the coefficient vector $C$ in step 4 is calculated as:
$$C = 2r$$
wherein $r$ is a random vector with a value range of $[0,1]$.
5. The big data intelligent search method based on the whale optimization algorithm according to claim 1, wherein the whale group positions are updated in step 5 as follows:
when $p < 0.5$ and $A < 1$, the position of the current whale is updated by:
$$X_{j+1} = X_j - A \times D$$
$$D = |C \times X_j^* - X_j|$$
wherein $j$ is the current iteration number, $X_j$ is the current position of the individual whale, $A$ and $C$ are the coefficient vectors, and $X_j^*$ is the best individual position in the current whale group;
when $p < 0.5$ and $A \geq 1$, a whale position $X_{rand}$ is randomly selected from the current group, and the position of the current whale is updated by:
$$X = X_{rand} - A \times D$$
$$D = |C \times X_{rand,j} - X_j|$$
wherein $X_{rand}$ is a randomly selected position in the current whale group, namely a random individual, and $X_{rand,j}$ is the randomly selected position in the $j$-th dimension of the current whale group;
when $p \geq 0.5$, the position of the current whale is updated by:
$$X_{j+1} = D' \times e^{bl} \times \cos(2\pi l) + X_j^*$$
$$D' = |X_j^* - X_j|$$
wherein $D'$ is the distance between the current best position of the $i$-th whale in the whale group and the prey, $b$ is the defined logarithmic spiral shape constant, and $l$ is a random number in $[-1,1]$.
6. The big data intelligent search method based on the whale optimization algorithm according to any one of claims 1-5, wherein the rule for determining the positions of the next-generation whale group in step 7 is: if the fitness function value corresponding to the updated position vector is higher than that before the update, the updated position replaces the original one; otherwise, the position from before the update is kept; and the fitness function value is calculated in the same way as in step 3.
7. A big data intelligent search system based on a whale optimization algorithm, characterized in that: the system comprises an input module, a whale optimization algorithm initialization module, a fitness function value module, a coefficient vector calculation module, a whale spatial position updating mode selection module, an updated whale group spatial position vector fitness value calculation module, a next-generation whale group position determination module, a module for recording the whale group position that is the global optimal solution G and its fitness function value, and a judgment module;
the input module: for reading in the search condition of the user and obtaining, from the engine database, the big data matching the keywords input by the user according to that search condition, wherein each item of big data is one whale and X_i is the current position of the i-th whale in the whale group; the positions of the whale group are initialized as:
X_i = (x_{i1}, x_{i2}, ..., x_{in})
i = 1, 2, ..., N, wherein n represents the dimension and N represents the whale group size;
the whale optimization algorithm initialization module: for initializing the parameters required by the whale optimization algorithm, comprising the whale group size N, the logarithmic spiral shape constant b, the current iteration number j and the maximum number of iterations M, with the global optimal position of the whole whale group denoted G;
the fitness function value module: for calculating the fitness function values of the initial positions of the whales in the whale optimization algorithm, and taking the big data with the highest fitness function value as the optimal spatial position X*_j of the current whale individual;
the coefficient vector calculation module: for calculating the coefficient vectors A and C;
the whale spatial position updating mode selection module: for generating a random number p in the range [0, 1] and selecting, according to the value of p, among the different modes of updating the whale group spatial positions;
the updated whale group spatial position vector fitness value calculation module: for decoding the updated position vectors of the whale group into the corresponding big data and showing that big data to the user, who scores it according to the user's search condition, the score serving as the fitness function value; and for finding and preserving the best whale individual X* in the current population;
the next-generation whale group position determination module: for determining the position of the next-generation whale group by comparing the fitness function values corresponding to the position vectors of the whale group before and after updating;
the module for recording the whale group position that is the global optimal solution G and its fitness function value: for recording the whale group position corresponding to the big data with the highest degree of coincidence as the global optimal solution G, together with its fitness function value;
the judgment module: for determining whether the user has found the desired text document in the engine;
if not, j is set to j + 1 and execution returns to step 4;
if yes, the fitness value of the optimal whale individual and the big data corresponding to its position X* are output.
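How the modules of claim 7 might cooperate in one loop is sketched below in Python. Several parts are assumptions rather than content of the claims shown here: `score` is a hypothetical numeric stand-in for the user's scoring of decoded big data, `target` stands in for the user's judgment that the desired document was found, A and C are scalars computed with the commonly published WOA formulas (A = 2·a·r - a, C = 2·r, with a falling linearly from 2 to 0), and the condition on A is read as a test on its magnitude.

```python
import math
import random


def woa_search(score, positions, max_iter, b=1.0, target=None, rng=None):
    """Hypothetical end-to-end sketch; returns (best position, best score)."""
    rng = rng or random.Random()
    whales = [list(p) for p in positions]          # input / initialization modules
    best = max(whales, key=score)                  # fitness function value module
    for j in range(max_iter):
        if target is not None and score(best) >= target:
            break                                  # judgment module: user is satisfied
        a = 2.0 - 2.0 * j / max_iter               # assumption: linear decrease of a
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a         # coefficient vector module (assumed form)
            C = 2.0 * rng.random()
            p = rng.random()                       # update mode selection module
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                near = abs(A) < 1                  # assumption: magnitude test on A
                ref = best if near else rng.choice(whales)
                base = x if near else ref          # base term as worded in claim 5
                cand = [bs - A * abs(C * r - xi)
                        for bs, r, xi in zip(base, ref, x)]
            else:
                cand = [abs(g - xi) * math.exp(b * l) * math.cos(2 * math.pi * l) + g
                        for g, xi in zip(best, x)]
            if score(cand) > score(x):             # next-generation position module
                whales[i] = cand
        best = max(whales + [best], key=score)     # global optimum recording module
    return best, score(best)
```

Passing a seeded `random.Random` as `rng` makes a run repeatable; in the system described by the claims, the numeric `score` would be replaced by interactive user feedback.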
CN201711252320.6A 2017-12-01 2017-12-01 Intelligent big data searching method and system based on whale optimization algorithm Active CN108021658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711252320.6A CN108021658B (en) 2017-12-01 2017-12-01 Intelligent big data searching method and system based on whale optimization algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711252320.6A CN108021658B (en) 2017-12-01 2017-12-01 Intelligent big data searching method and system based on whale optimization algorithm

Publications (2)

Publication Number Publication Date
CN108021658A CN108021658A (en) 2018-05-11
CN108021658B CN108021658B (en) 2023-05-26

Family

ID=62078115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711252320.6A Active CN108021658B (en) 2017-12-01 2017-12-01 Intelligent big data searching method and system based on whale optimization algorithm

Country Status (1)

Country Link
CN (1) CN108021658B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108873835A (en) * 2018-06-12 2018-11-23 昆明理工大学 The Optimization Scheduling of photoetching process in a kind of manufacture of semiconductor integrated circuit
CN109345005A (en) * 2018-09-12 2019-02-15 中国电力科学研究院有限公司 A multi-dimensional optimization method for integrated energy system based on improved whale algorithm
CN109931903B (en) * 2019-02-26 2021-07-06 上海大学 A Cylindricity Evaluation Method Based on Improved Whale Optimization Algorithm
CN109886589B (en) * 2019-02-28 2024-01-05 长安大学 Method for solving low-carbon workshop scheduling based on improved whale optimization algorithm
CN109902873A (en) * 2019-02-28 2019-06-18 长安大学 A method for cloud manufacturing resource allocation based on improved whale algorithm
CN109886588B (en) * 2019-02-28 2024-01-02 长安大学 Method for solving flexible job shop scheduling based on improved whale algorithm
CN110059875B (en) * 2019-04-12 2023-02-17 湖北工业大学 Public bicycle demand forecasting method based on distributed whale optimization algorithm
CN110322050B (en) * 2019-06-04 2023-04-07 西安邮电大学 Wind energy resource data compensation method
CN113887691B (en) * 2021-08-24 2022-09-16 杭州电子科技大学 Whale evolution system and method for service composition problem
CN114579801B (en) * 2022-04-28 2022-08-12 深圳市华曦达科技股份有限公司 Long video recommendation method based on eagle optimization algorithm
CN119402433A (en) * 2024-09-24 2025-02-07 华中科技大学 Industrial Internet client network traffic control method, device, equipment and medium
CN119989316B (en) * 2025-04-15 2025-07-29 北京华电数智云链科技有限公司 Multi-subject data collaborative risk supervision system and method based on blockchain

Citations (3)

Publication number Priority date Publication date Assignee Title
CN104951501A (en) * 2015-04-27 2015-09-30 安徽大学 Particle swarm algorithm based intelligent big data searching algorithm
CN105074623A (en) * 2013-03-14 2015-11-18 微软技术许可有限责任公司 Presenting object models in augmented reality images
CN107016436A (en) * 2017-03-31 2017-08-04 浙江大学 A kind of mixing whale algorithm of bionical policy optimization

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US7266553B1 (en) * 2002-07-01 2007-09-04 Microsoft Corporation Content data indexing


Also Published As

Publication number Publication date
CN108021658A (en) 2018-05-11

Similar Documents

Publication Publication Date Title
CN108021658B (en) Intelligent big data searching method and system based on whale optimization algorithm
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
CN108052593B (en) A topic keyword extraction method based on topic word vector and network structure
CN111832289B (en) Service discovery method based on clustering and Gaussian LDA
CN113297369B (en) Intelligent Question Answering System Based on Knowledge Graph Subgraph Retrieval
CN105045875B (en) Personalized search and device
CN109241243B (en) Candidate document sorting method and device
CN108647350A (en) Image-text associated retrieval method based on two-channel network
CN109408600B (en) Book recommendation method based on data mining
CN104834679B (en) A kind of expression of action trail, querying method and device
CN107291895B (en) A Fast Hierarchical Document Query Method
CN105183833A (en) User model based microblogging text recommendation method and recommendation apparatus thereof
CN112860898B (en) Short text box clustering method, system, equipment and storage medium
CN109597995A (en) A kind of document representation method based on BM25 weighted combination term vector
CN108563766A (en) The method and device of food retrieval
CN103778206A (en) Method for providing network service resources
CN102915381A (en) Multi-dimensional semantic based visualized network retrieval rendering system and rendering control method
CN110580252A (en) Spatial object index and query method under multi-objective optimization
CN103761286B (en) A kind of Service Source search method based on user interest
CN120179890B (en) Text searching method and system based on vector retrieval and large model optimization
CN117435685A (en) Document retrieval method, document retrieval device, computer equipment, storage medium and product
CN118798366A (en) Military knowledge question and answer generation method and computer system based on knowledge graph
Zaw et al. Web document clustering by using PSO-based cuckoo search clustering algorithm
CN107832319B (en) Heuristic query expansion method based on semantic association network
CN105677830B (en) A kind of dissimilar medium similarity calculation method and search method based on entity mapping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant