
CN114625952B - Information recommendation method and system based on VSM and AMMK-means - Google Patents

Information recommendation method and system based on VSM and AMMK-means

Info

Publication number
CN114625952B
Authority
CN
China
Prior art keywords
information
user
item
browsed
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011432407.3A
Other languages
Chinese (zh)
Other versions
CN114625952A (en
Inventor
彭石宝
曹郁
焦峰
王炜华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
93216 Troops Of Chinese Pla
Original Assignee
93216 Troops Of Chinese Pla
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 93216 Troops Of Chinese Pla
Priority to CN202011432407.3A
Publication of CN114625952A
Application granted
Publication of CN114625952B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an information recommendation method and system based on VSM and AMMK-means. The method comprises: obtaining an item portrait of each piece of candidate information; substituting the item portraits of the candidate information into a pre-constructed interest model to obtain the similarity between each piece of candidate information and the user portrait; and recommending the candidate information with the highest similarity to the user, wherein the interest model is constructed based on the VSM, AMMK-means and the item portraits of the information browsed by the user. Because the interest model is constructed from the VSM, AMMK-means and the item portraits of the information browsed by the user, it is in effect customized to the item portraits of the user's interests. The invention thereby avoids the deviation, caused by classification standards set by editors, between the information categories recommended to the user and the categories the user is actually interested in, and improves recommendation accuracy compared with the traditional collaborative filtering algorithm.

Description

Information recommendation method and system based on VSM and AMMK-means
Technical Field
The invention relates to the field of information retrieval, in particular to an information recommendation method and system based on VSM and AMMK-means.
Background
Situation assessment refers to analyzing, reasoning about and judging multi-source information, on the basis of object assessment, in order to understand the relationships on the battlefield and to support command-level decisions. Because battlefield information is large in volume and varied in type, each seat (namely, each user) has difficulty selecting information, and an auxiliary decision-making system is required to recommend information of interest to the different seats according to the browsing records of each seat and the characteristics of the battlefield information.
Recommendation algorithms, typically based on collaborative filtering, make recommendations from user browsing records or feedback records. In addition to these records, some studies also take information categories into account when generating recommendations, but the categories are assigned manually by information editors, so the classification standard represents only the editors' opinions, which causes the recommendation results to deviate.
Disclosure of Invention
In order to solve the defects in the prior art, the invention provides an information recommendation method and system based on VSM and AMMK-means.
In a first aspect, an information recommendation method based on VSM and AMMK-means is provided, including:
acquiring an item portrait of each piece of candidate information;
substituting the item portraits of the candidate information into a pre-constructed interest model to obtain the similarity between each piece of candidate information and the user portrait;
recommending the candidate information with the highest similarity to the user;
wherein the interest model is constructed based on the VSM, AMMK-means and the item portraits of the information that the user has browsed.
Preferably, the construction of the interest model includes:
acquiring the item portraits of the information that the user has browsed, and characterizing these item portraits by using the VSM;
clustering the item portraits of the information browsed by the user through AMMK-means, and taking the clustering result as the information categories of interest to the user;
calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information browsed by the user;
generating a user portrait based on the information categories of interest to the user and the weights of the information categories;
and calculating the similarity between the item portrait of the candidate information and the item portrait in the user portrait.
Further, clustering the item portraits of the information browsed by the user through AMMK-means and taking the clustering result as the information categories of interest to the user comprises:
generating a dataset based on the item portraits of the information that the user has browsed;
determining the cluster centers and the number of cluster centers for the samples in the dataset by using a maximum and minimum distance clustering algorithm;
taking the number of cluster centers as the K value in the K-means algorithm, and taking all the obtained cluster centers as the initial cluster centers of the K-means algorithm;
assigning samples based on the distance between each sample in the dataset and each cluster center, and obtaining a clustering result when the set constraint condition is met;
and taking the clustering result as the information categories of interest to the user.
Further, determining the cluster centers and the number of cluster centers for the samples in the dataset by using the maximum and minimum distance clustering algorithm comprises:
calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
selecting the sample farthest from $C_1$ as the second cluster center $C_2$;
calculating the distances $D_{i1}$ and $D_{i2}$ from each remaining sample to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $\theta$ is a given value and $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$; if $D_j > \theta D_{12}$, establishing a fourth cluster center;
and so on, ending the search for cluster centers when the maximum-minimum distance is no longer greater than $\theta D_{12}$, thereby obtaining the cluster centers and the number of cluster centers.
Preferably, the expression of the interest model is as follows:
$$V_{seat} = (w_1 T_1, w_2 T_2, \ldots, w_m T_m)^T$$
where $V_{seat}$ represents the user portrait, $w_m$ represents the weight of the m-th information category, and $T_m$ represents the feature vector of the m-th information category.
Preferably, the similarity is calculated as follows:
where seat is the item portrait in the user portrait, $w_i$ is the weight of the information category to which the candidate information $d_i$ belongs, $T_i^T$ is the feature vector of that information category, and $\vec{d_i}$ is the feature vector of $d_i$.
In a second aspect, there is provided an information recommendation system based on VSM and AMMK-means, comprising:
an acquisition module, used for acquiring the item portrait of each piece of candidate information;
a similarity calculation module, used for substituting the item portraits of the candidate information into a pre-constructed interest model to obtain the similarity between each piece of candidate information and the user portrait;
a recommending module, used for recommending the candidate information with the highest similarity to the user;
wherein the interest model is constructed based on the VSM, AMMK-means and the item portraits of the information that the user has browsed.
Preferably, the system further comprises a building module of an interest model, wherein the building module of the interest model comprises:
a first construction unit, used for acquiring the item portraits of the information that the user has browsed and characterizing these item portraits by using the VSM;
an information category construction unit, used for clustering the item portraits of the information browsed by the user through AMMK-means and taking the clustering result as the information categories of interest to the user;
a weight calculation unit, used for calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information browsed by the user;
a user portrait construction unit, used for generating a user portrait based on the information categories of interest to the user and the weights of the information categories;
and a calculating unit, used for calculating the similarity between the item portrait of the candidate information and the item portrait in the user portrait.
In a third aspect, a storage device is provided, in which a plurality of program codes are stored, the program codes are adapted to be loaded and executed by a processor to perform the information recommendation method based on VSM and AMMK-means according to any of the above technical solutions.
In a fourth aspect, a control device is provided, including a processor and a storage device, the storage device being adapted to store a plurality of program codes, the program codes being adapted to be loaded and executed by the processor to perform the information recommendation method based on VSM and AMMK-means according to any of the above-mentioned aspects.
The technical scheme provided by the invention has at least one or more of the following beneficial effects:
In this embodiment, an item portrait of each piece of candidate information is first acquired; the item portraits are then substituted into a pre-constructed interest model to obtain the similarity between each piece of candidate information and the user portrait; and the candidate information with the highest similarity is recommended to the user. Because the interest model in this embodiment is constructed based on the VSM, AMMK-means and the item portraits of the information browsed by the user, it is in effect customized to the item portraits of the user's interests. This avoids the deviation, caused by classification standards set by editors, between the information categories recommended to the user and the categories the user is actually interested in, and improves recommendation accuracy compared with the traditional collaborative filtering algorithm.
Drawings
FIG. 1 is a flow chart illustrating the main steps of a method for information recommendation based on VSM and AMMK-means according to an embodiment of the present invention;
FIG. 2 is a flowchart of the main steps of the modified algorithm AMMK-means according to the embodiment of the present invention;
FIG. 3 is a main block diagram of an information recommendation method based on VSM and AMMK-means according to an embodiment of the present invention.
Detailed Description
For a better understanding of the present invention, reference is made to the following description, drawings and examples.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
To solve the problems of existing information recommendation models, this embodiment provides an information recommendation method based on VSM and AMMK-means, where AMMK-means is a K-means clustering algorithm improved with the maximum and minimum distance clustering algorithm. Because information content and information category both strongly influence user interest, the method builds an interest model that represents information content while classifying the quantized information. In this process, the VSM is first used to represent the text of each piece of information; the AMMK-means algorithm is then used to cluster the information and build the user's interest model; finally, the similarity between the user's interest model and the candidate information is calculated, and the information of interest is recommended to the user. The embodiment inherits the interpretability of existing content-based recommendation algorithms while using the improved clustering algorithm to avoid the problems of editor-defined classification. Experiments show that the prediction accuracy of the method provided by this embodiment is higher than that of the traditional collaborative filtering algorithm.
In the embodiment of the invention, referring to fig. 1, fig. 1 is a flowchart of an information recommendation method based on VSM and AMMK-means. As shown in FIG. 1, the information recommendation method based on VSM and AMMK-means in the embodiment of the invention mainly comprises the following steps:
S1, acquiring an item portrait of each piece of candidate information;
S2, substituting the item portraits of the candidate information into a pre-constructed interest model to obtain the similarity between each piece of candidate information and the user portrait;
S3, recommending the candidate information with the highest similarity to the user;
wherein the interest model is constructed based on the VSM, AMMK-means and the item portraits of the information that the user has browsed.
In this embodiment, the information fusion is also called data fusion or multi-sensor fusion, and may be defined as an information processing process that uses computer technology to automatically analyze and integrate several sensor observation information obtained in time sequence under a certain criterion so as to complete task decision and evaluation.
In this embodiment, an information keyword is one of the most representative words in a piece of information; it characterizes the uniqueness of the information and is generally extracted by a text processing algorithm.
In this embodiment, because the content of the information is text, the result of vectorizing the information text is called the information feature vector and is usually represented as a vector $d = \{w_1, w_2, \ldots, w_i, \ldots, w_n\}$.
In one embodiment, the construction of the interest model in S2 includes:
acquiring the item portraits of the information that the user has browsed, and characterizing these item portraits by using the VSM;
clustering the item portraits of the information browsed by the user through AMMK-means, and taking the clustering result as the information categories of interest to the user;
calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information browsed by the user;
generating a user portrait based on the information categories of interest to the user and the weights of the information categories;
and calculating the similarity between the item portrait of the candidate information and the item portrait in the user portrait.
In this embodiment, the VSM may be used to characterize the item portraits of the information that the user has browsed, as follows:
In the vector space model (VSM), each document is represented by a feature vector that captures the multidimensional information in the document. This feature vector is the item portrait. Constructing item portraits of the information not only reflects the high dimensionality of the information but also makes the portraits convenient to use as input for clustering, so as to construct the user interest model.
This embodiment adopts the vector space model to represent information feature vectors. Given an information set $X = \{X_1, X_2, \ldots, X_i, \ldots, X_n\}$, the vectorization of a piece of information is represented as:
$$X_i = (w_{i1}, w_{i2}, \ldots, w_{im})$$
where $w_{ij}$ represents the weight of keyword j in information i.
In the VSM construction process, the dimension m of the keyword set is determined first. The keywords represent the characteristics of the document, and the time complexity grows with m as the number of keywords increases. To reduce time overhead while preserving the quality of the representation, this embodiment extracts the top 5 keywords of each piece of information to characterize it (3 to 5 keywords generally give the best effect), then uses the TF-IDF algorithm to obtain the dimension m of the keyword set over the information set and to calculate the weights $w_{ij}$.
The TF-IDF calculation of the weight $w_{ij}$ is divided into a term frequency (TF) part and an inverse document frequency (IDF) part; the product of the two parts determines the weight of a word in a document. In the formulas below, i indexes the keyword and j indexes the information document.
The TF is calculated as:
$$TF(i, j) = \frac{count(i, j)}{size(j)}$$
where $count(i, j)$ represents the number of occurrences of keyword i in information document j, and $size(j)$ represents the total number of words in information j.
The IDF is calculated as:
$$IDF(i) = \log\frac{N}{N(i)}$$
where N represents the total number of pieces of information in the information set, and $N(i)$ represents the number of pieces of information in which keyword i appears.
The weight is calculated from TF and IDF as:
$$w_{ij} = TF(i, j) \cdot IDF(i)$$
The weights $w_{ij}$ are then normalized.
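The weighting scheme above can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function name `tfidf_vectors` and the parameter `top_k` are hypothetical, the inputs are assumed to be already tokenized, and the sketch assumes the common form IDF(i) = log(N / N(i)) and an L2 (unit-length) normalization, both of which are illustrative choices.

```python
import math
from collections import Counter


def tfidf_vectors(docs, top_k=5):
    """Build VSM feature vectors from tokenized documents.

    docs: list of token lists (hypothetical input format).
    Returns (vocabulary, vectors), one vector per document.
    """
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]

    # N(i): number of documents in which keyword i appears.
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())

    def weight(term, j):
        tf = counts[j][term] / len(docs[j])       # TF = count / size
        idf = math.log(n_docs / doc_freq[term])   # assumed IDF form
        return tf * idf

    # Keep the top_k highest-weighted keywords of each document.
    keywords = []
    for j, c in enumerate(counts):
        ranked = sorted(c, key=lambda t: (-weight(t, j), t))
        keywords.append(ranked[:top_k])

    # The union of the kept keywords fixes the dimension m.
    vocabulary = sorted({t for ks in keywords for t in ks})

    vectors = []
    for j, ks in enumerate(keywords):
        kept = set(ks)
        row = [weight(t, j) if t in kept else 0.0 for t in vocabulary]
        norm = math.sqrt(sum(w * w for w in row))  # assumed L2 normalization
        vectors.append([w / norm if norm else 0.0 for w in row])
    return vocabulary, vectors
```

Calling `tfidf_vectors(docs)` on a list of token lists returns the shared keyword vocabulary and one normalized vector per document; these vectors play the role of the item portraits that are clustered later.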
In one embodiment, the inventors have found that the conventional recommendation method based on VSM and K-means has the following drawbacks:
1) Recommendation algorithms, typically based on collaborative filtering, make recommendations from user browsing records or feedback records. Some methods additionally take information categories into account, but the categories are assigned manually by information editors, so the classification standard represents only the editors' opinions, which causes the recommendation results to deviate.
2) The K-means algorithm used for clustering has two limitations. First, the number of clusters must be preset, but an accurate number is difficult to give in practice; there is no reference for choosing it for different datasets, so a large number of training experiments are needed. Second, the initial cluster centers are chosen randomly; if the initial center positions are unsuitable, the amount of computation is likely to increase and the global optimal solution may not be obtained.
The maximum and minimum distance clustering algorithm was first used in the field of pattern recognition. By probing the Euclidean distances between samples, it selects samples that are as far apart as possible as cluster centers, which effectively avoids poor clustering results caused by initial centers that are too close together. The number of clusters emerges naturally once the initial cluster centers have been selected, which overcomes the need to know the number of classes in advance in K-means clustering.
Specifically, clustering the item portraits of the information browsed by the user through AMMK-means and taking the clustering result as the information categories of interest to the user comprises:
generating a dataset based on the item portraits of the information that the user has browsed;
determining the cluster centers and the number of cluster centers for the samples in the dataset by using a maximum and minimum distance clustering algorithm;
taking the number of cluster centers as the K value in the K-means algorithm, and taking all the obtained cluster centers as the initial cluster centers of the K-means algorithm;
assigning samples based on the distance between each sample in the dataset and each cluster center, and obtaining a clustering result when the set constraint condition is met;
and taking the clustering result as the information categories of interest to the user.
Specifically, determining the cluster centers and the number of cluster centers for the samples in the dataset by using the maximum and minimum distance clustering algorithm comprises:
calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
selecting the sample farthest from $C_1$ as the second cluster center $C_2$;
calculating the distances $D_{i1}$ and $D_{i2}$ from each remaining sample to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $\theta$ is a given value and $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$; if $D_j > \theta D_{12}$, establishing a fourth cluster center;
and so on, ending the search for cluster centers when the maximum-minimum distance is no longer greater than $\theta D_{12}$, thereby obtaining the cluster centers and the number of cluster centers.
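The center-selection procedure above can be sketched as follows. This is an illustrative reading of the steps, assuming Euclidean distance and samples given as equal-length numeric tuples; the names `euclidean` and `max_min_centers` and the default `theta=0.5` are hypothetical choices, not values from the source.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_min_centers(samples, theta=0.5):
    """Return the indices of the cluster centers chosen by the max-min rule."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]

    # C1: the sample closest to the attribute mean.
    c1 = min(range(n), key=lambda i: euclidean(samples[i], mean))
    # C2: the sample farthest from C1.
    c2 = max(range(n), key=lambda i: euclidean(samples[i], samples[c1]))
    d12 = euclidean(samples[c1], samples[c2])
    if d12 == 0:
        return [c1]  # all samples coincide, so there is a single center

    centers = [c1, c2]
    # For every sample, the distance to its nearest chosen center.
    nearest = [min(euclidean(s, samples[c]) for c in centers) for s in samples]
    while True:
        cand = max(range(n), key=lambda i: nearest[i])
        if nearest[cand] <= theta * d12:
            break  # max-min distance no longer exceeds theta * D12
        centers.append(cand)
        nearest = [min(d, euclidean(s, samples[cand]))
                   for d, s in zip(nearest, samples)]
    return centers
```

The length of the returned list gives the K value, and the indexed samples serve as the initial centers for the subsequent K-means stage. A smaller `theta` tends to produce more centers, a larger one fewer.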
This clustering algorithm does not require the number of clusters to be set in advance or estimated through a large number of experiments, requires less computation, and optimizes the selection of the initial cluster centers, thereby avoiding the tendency of the conventional algorithm to reach only a local optimum because its initial cluster centers are chosen randomly.
In one embodiment, the improved algorithm AMMK-means is shown in FIG. 2, which is a flowchart of the main steps of the improved algorithm AMMK-means according to an embodiment of the present invention. The specific steps of the improved algorithm AMMK-means are as follows:
For a given dataset $X = \{X_1, X_2, \ldots, X_n\}$:
Step 1, calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
Step 2, giving $\theta$, where $0 < \theta < 1$;
Step 3, selecting the sample point with the largest distance $D_{i1}$ from $C_1$ as the second cluster center $C_2$;
Step 4, calculating the distances $D_{i1}$ and $D_{i2}$ from each sample to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
Step 5, if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$; if $D_j > \theta D_{12}$, establishing a further cluster center, and so on, until the maximum-minimum distance is no longer greater than $\theta D_{12}$;
Step 6, taking the number of cluster centers obtained in steps 1 and 3 to 5 as the K value in the K-means algorithm, and taking all the obtained cluster centers as the initial cluster centers of the K-means algorithm;
Step 7, calculating the distance between each sample in the sample set and each cluster center, assigning each remaining sample to the cluster whose centroid is nearest, and updating the centroid of each cluster;
Step 8, repeating step 7 until the sum-of-squared-errors criterion is met and the objective function is minimized, which completes the clustering and yields the clustering result.
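The objective function in the above step 8 is as follows:
$$J_c = \sum_{i=1}^{K} \sum_{p \in S_i} \lVert p - C_i \rVert^2$$
where p represents a data object, $S_i$ denotes the set of data objects assigned to the i-th cluster, $C_i$ represents the centroid of that cluster, and $J_c$ represents the sum of squared errors of all objects in the dataset.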
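Steps 7 and 8 can be sketched as a K-means loop that starts from centers supplied by the max-min stage rather than from random ones. The function name `kmeans_from_centers`, the tolerance `tol` and the iteration cap `max_iter` are hypothetical; the source only says that iteration stops when the sum-of-squared-errors criterion is met, so stopping when the error no longer decreases is an assumption of this sketch.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from_centers(samples, init_centers, tol=1e-9, max_iter=100):
    """Run K-means from given initial centers.

    Returns (labels, centroids, sse), where sse is the sum of squared
    errors between each sample and the centroid of its cluster.
    """
    centroids = [list(c) for c in init_centers]
    prev_sse = math.inf
    labels, sse = [], math.inf
    for _ in range(max_iter):
        # Assign every sample to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: sq_dist(s, centroids[k]))
                  for s in samples]
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [s for s, l in zip(samples, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
        # Sum of squared errors for the current assignment.
        sse = sum(sq_dist(s, centroids[l]) for s, l in zip(samples, labels))
        if prev_sse - sse <= tol:
            break  # assumed stopping rule: the error stopped decreasing
        prev_sse = sse
    return labels, centroids, sse
```

Because the initial centers are fixed by the data, repeated runs on the same dataset give the same result, which is one practical difference from randomly initialized K-means.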
The AMMK-means algorithm in the method provided by this embodiment does not need the number of clusters to be preset or estimated in advance through a large number of experiments, requires less computation, and selects suitable initial cluster centers, thereby avoiding the increased computation and the failure to reach the global optimal solution that result from choosing the initial cluster centers randomly.
In the existing clustering process based on the Chameleon algorithm, on the one hand, constructing the K-nearest-neighbor graph ($G_k$ graph) requires calculating the similarity between every pair of data points and taking the first K values in descending order of similarity. The K value of the K-nearest-neighbor graph and the threshold of the similarity function must be given manually, which requires a large amount of prior knowledge and is difficult. The clustering algorithm provided by this embodiment does not require these parameters to be given manually, avoids the prior knowledge they require, and reduces the difficulty. On the other hand, when the $G_k$ graph is divided into unconnected sub-graphs that serve as the initial clusters, the graph is first divided into two approximately equal sub-graphs according to the minimum-cut principle, the resulting sub-graphs are taken as initial clusters, and the process is repeated until the division criterion is reached. The partitioning technique used to divide the $G_k$ graph increases the complexity of the algorithm, and the minimum bisection it relies on is difficult to choose. The clustering method provided by this embodiment reduces the algorithm complexity and avoids the difficulty of choosing a minimum bisection.
In one embodiment, a user interest model is built from the information the user has browsed. The first layer of nodes of the model is the user, the second layer of nodes is the information categories, and the third layer of nodes is the battlefield information browsed by the user.
If the information browsed by the user falls into m different categories of battlefield information, the user interest model may be expressed as:
$$seat = \{(T_1, w_1, n_1), \ldots, (T_m, w_m, n_m)\}$$
where $T_i$ represents the feature vector of the i-th information category, $w_i$ represents the weight of the i-th information category, and $n_i$ represents the number of pieces of browsed information contained in the i-th information category.
The feature vector of a certain information category is obtained from a weighted average of all browsed information feature vectors contained in the category.
The feature vector $T_i$ of the $i$-th information category is computed as the interest-weighted average of the feature vectors of the browsed information in that category.
Here $E_i$ is the set of information the user has browsed in information category $i$, $e_j$ is the feature vector of the $j$-th piece of information, and $I_j$ is the user's interest level in the $j$-th piece of information in the category. Because browsing a piece of information already indicates the user's interest in it, $I_j$ can be set to 1, which reduces the weighted average to the plain mean of the browsed feature vectors in the category.
Further, $w_i$ is calculated as the ratio of the number of pieces of information the user has browsed in the $i$-th information category to the total number of pieces of information the user has browsed.
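Written out under the definitions above, the category feature vector and the category weight could take the following form. This is a reconstruction from the prose, not the published formula: it assumes the weighted average is normalized by the sum of the interest levels, with $E_i$ the set of browsed information in category $i$, $e_j$ an information feature vector, $I_j$ the interest level, and $n_i = |E_i|$.

```latex
T_i = \frac{\sum_{e_j \in E_i} I_j \, e_j}{\sum_{e_j \in E_i} I_j},
\qquad
T_i \Big|_{I_j = 1} = \frac{1}{n_i} \sum_{e_j \in E_i} e_j,
\qquad
w_i = \frac{n_i}{\sum_{k=1}^{m} n_k}
```

With every $I_j$ set to 1 the first expression collapses to the second, so the simplified form does not depend on the assumed normalization.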
At the time of calculation, the user interest model is expressed as:
$V_{\mathrm{seat}} = (w_1 T_1, w_2 T_2, \ldots, w_m T_m)^{T}$
Finally, cosine similarity is used to calculate the similarity between the candidate battlefield information $d_i$ and the user.
In that calculation, $w_i T_i^{T}$ is the weighted feature vector of the battlefield information category to which the candidate information $d_i$ belongs, and the other operand is the feature vector of $d_i$.
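The interest model and the similarity step can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`build_interest_model`, `cosine_similarity`, `score_candidate`) are hypothetical, every interest level is taken as 1 so that each category vector is a plain mean, and the category of a candidate is assumed to be known already and passed in by the caller.

```python
import numpy as np

def build_interest_model(vectors, labels):
    """Return {category: (T_i, w_i, n_i)} from browsed-item vectors and category labels."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    total = len(labels)
    model = {}
    for c in np.unique(labels):
        members = vectors[labels == c]
        n_i = len(members)              # number of browsed items in the category
        t_i = members.mean(axis=0)      # interest level 1 for every item: plain mean
        w_i = n_i / total               # share of all browsed items
        model[int(c)] = (t_i, w_i, n_i)
    return model

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def score_candidate(model, category, d_vec):
    """Similarity between a candidate vector and the weighted vector of its category."""
    t_i, w_i, _ = model[category]
    return cosine_similarity(w_i * t_i, d_vec)

# Three browsed items: two in category 0, one in category 1.
model = build_interest_model([[1, 0], [1, 0], [0, 1]], [0, 0, 1])
score = score_candidate(model, 0, [2, 0])
```

Since cosine similarity is unchanged by positive scaling, the weight $w_i$ does not alter the score of a single candidate against its own category in this sketch; it would matter only if scores were combined or compared in some other way.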
The recommendation method and device of this embodiment express the user's interests accurately and improve the recommendation effect. They also overcome a drawback of collaborative-filtering recommendation based on user browsing records or feedback records: in that approach the information categories come from manual classification by information editors, so the classification standard reflects only the editors' opinions.
In a specific embodiment, the characteristic attribute of the information is acquired first, the information browsed by the user is analyzed to generate a user portrait, the characteristic similarity between the user portrait and the candidate information is calculated, and finally the information with high similarity is recommended to the user according to the similarity.
The content-based recommendation method generally includes three steps of item portraits, user portraits, and recommendation generation.
The item portrait represents an item by its feature information. The attributes describing an item include structured data and unstructured data, and unstructured data must be converted into structured data before the model can use it.
A commonly used item representation method is the vector space model (VSM) based on TF-IDF weights.
The VSM converts text documents into vectors in a vector space, and TF-IDF is used to calculate the weight of each keyword in each document. Because words in a document can be synonymous or polysemous, the robustness and accuracy of the recommendation model are reduced. To improve the model's generalization with respect to polysemy and synonymy, semantic analysis and knowledge graphs are applied to the recommendation system.
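As an illustration of a TF-IDF vector space model, the sketch below turns tokenized documents into weight vectors using only the Python standard library. The function name `tfidf_vectors` is hypothetical, and the particular variant (term frequency as count divided by document length, inverse document frequency as log(N / df)) is an assumption; the text does not say which TF-IDF variant is used.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, list of TF-IDF vectors)."""
    vocab = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1          # avoid division by zero for empty documents
        vectors.append([counts[t] / length * idf[t] for t in vocab])
    return vocab, vectors

vocab, vecs = tfidf_vectors([["radar", "target", "radar"], ["target", "supply"]])
```

A term that occurs in every document ("target" here) gets weight 0 under this variant, which is the usual motivation for the IDF factor: such terms do not help distinguish one item from another.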
The user portrait builds the user interest model from the features of the items the user has previously browsed or rated. Building it involves two parts, text classification and construction of a hierarchical user interest model: first, the item portraits of the information the user has browsed are clustered to obtain the information categories and their corresponding features (that is, item portraits); second, the weight of each information category is calculated; finally, the number of pieces of browsed information contained in each category is counted. Traditional text classification models include nearest-neighbor algorithms, the Rocchio algorithm, decision trees, linear classifiers, and Bayesian classifiers. The hierarchical user interest model has a three-layer structure, either user-category-item or user-interest-item. Recommendation generation compares the feature similarity between the user portrait and the candidate items and recommends the set of items with the highest relevance to the user; common similarity measures are the Pearson correlation coefficient and cosine similarity.
The item portrait in this embodiment is a series of labels attached to each item. An item portrait can serve as an item feature in the recommendation model. In a recommendation system, the item portrait is the basis of the user portrait: item portrait + user behavior = user portrait.
It should be noted that, although the foregoing embodiments describe the steps in a specific order, it will be understood by those skilled in the art that, in order to achieve the effects of the present invention, the steps are not necessarily performed in such an order, and may be performed simultaneously (in parallel) or in other orders, and these variations are within the scope of the present invention.
Based on the same inventive concept, as shown in fig. 3, the present invention further provides an information recommendation system based on VSM and AMMK-means, including:
the acquisition module is used for acquiring the item portrait of each piece of candidate information;
the similarity calculation module is used for substituting the item portrait of each piece of candidate information into a pre-built interest model to obtain the similarity between that candidate information and the user portrait;
the recommendation module is used for recommending the candidate information with the highest similarity to the user;
the interest model is built based on the VSM, AMMK-means, and the item portraits of the information the user has browsed.
In an embodiment, the system further comprises a construction module for the interest model, which comprises:
the first construction unit is used for acquiring the item portraits of the information the user has browsed and representing them using the VSM;
the information category construction unit is used for clustering the item portraits of the information the user has browsed through AMMK-means, and taking the clustering result as the information categories of interest to the user;
the weight calculation unit is used for calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information the user has browsed;
the user portrait construction unit is used for generating the user portrait based on the information categories of interest to the user and the weights of those categories;
and the calculation unit is used for calculating the similarity between the item portrait of the candidate information and the item portraits in the user portrait.
The maximum and minimum distance method adopted by the embodiment is based on Euclidean distance, and objects which are as far as possible are taken as clustering centers, so that the situation that the clustering centers are too close to each other possibly occurring during the initial value selection of the K-means method is avoided, the number of initial clustering centers is intelligently determined, and the efficiency of dividing the initial data set is improved.
The information category construction unit in the embodiment is specifically configured to:
generating a dataset based on the item portraits of the information that the user has browsed;
determining a clustering center and the number of the clustering centers by using a maximum and minimum distance clustering algorithm for samples in the data set;
Taking the number of the clustering centers as a K value in a K-means algorithm, and taking all the obtained clustering centers as initial clustering centers in the K-means clustering algorithm;
Based on the distance between each sample in the dataset and each initial clustering center, a clustering result is obtained when the set constraint condition is met;
and taking the clustering result as the information category of interest to the user.
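The K-means stage of this unit, started from centers chosen beforehand, might look like the following sketch. The function name `kmeans_from_centers` and the stopping rule are assumptions: the text only refers to a "set constraint condition", so the sketch stops when the centers move by no more than `tol` or after `max_iter` iterations.

```python
import numpy as np

def kmeans_from_centers(X, init_centers, max_iter=100, tol=1e-6):
    """Standard K-means iterations starting from the given initial centers."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Euclidean distance from every sample to every center: shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:        # keep the old center if a cluster is empty
                new_centers[k] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift <= tol:                # assumed constraint: centers have stabilized
            break
    return labels, centers

labels, centers = kmeans_from_centers(
    [[0, 0], [0, 1], [10, 10], [10, 11]], [[0, 0], [10, 10]]
)
```

The value of K is never passed explicitly; it is simply the number of rows in `init_centers`, which matches the idea of taking both K and the starting points from the preceding center-selection step.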
In an embodiment, determining the cluster center and the number of cluster centers for the samples in the dataset by using a maximum-minimum distance clustering algorithm includes:
calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
selecting the sample farthest from $C_1$ as the second cluster center $C_2$;
calculating the distances $D_{i1}$ and $D_{i2}$ from all remaining samples to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $\theta$ is a given value and $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$, and if $D_j > \theta D_{12}$, establishing a fourth cluster center;
and so on, until the max-min distance is no greater than $\theta D_{12}$, at which point the search for cluster centers ends and the cluster centers and their number are obtained.
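The center-selection steps above can be sketched as follows. The function name `max_min_centers` and the default `theta=0.5` are illustrative assumptions (the text only says θ is a given value), and Euclidean distance is used as stated for the max-min distance method.

```python
import numpy as np

def max_min_centers(X, theta=0.5):
    """Pick initial cluster centers with the max-min distance rule.

    Returns (centers, k), where centers are rows of X and k is their count.
    """
    if theta < 0:
        raise ValueError("theta must be non-negative")
    X = np.asarray(X, dtype=float)
    # First center: the sample closest to the attribute-wise mean.
    first = int(np.linalg.norm(X - X.mean(axis=0), axis=1).argmin())
    chosen = [first]
    d_first = np.linalg.norm(X - X[first], axis=1)
    # Second center: the sample farthest from the first center.
    second = int(d_first.argmax())
    d12 = d_first[second]
    if d12 == 0:                        # all samples coincide: a single center
        return X[chosen], 1
    chosen.append(second)
    # For every sample, the distance to its nearest chosen center.
    min_dist = np.minimum(d_first, np.linalg.norm(X - X[second], axis=1))
    while True:
        cand = int(min_dist.argmax())   # sample with the largest nearest-center distance
        if min_dist[cand] <= theta * d12:
            break                       # max-min distance no longer exceeds theta * D12
        chosen.append(cand)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[cand], axis=1))
    return X[chosen], len(chosen)

centers, k = max_min_centers([[0, 0], [1, 0], [10, 0], [11, 0], [5, 8]], theta=0.5)
```

On this toy data the rule selects three centers, one from each visually separate group, so K falls out of the data and θ rather than being chosen by hand. A smaller θ admits more centers and a larger θ fewer.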
In an embodiment, the expression of the interest model is as follows:
$V_{\mathrm{seat}} = (w_1 T_1, w_2 T_2, \ldots, w_m T_m)^{T}$
where $V_{\mathrm{seat}}$ represents the user portrait, $w_m$ represents the weight of the $m$-th information category, and $T_m$ represents the feature vector of the $m$-th information category.
In an embodiment, the similarity is calculated by a formula in which seat is the item portrait in the user portrait, $w_i$ is the weight of the information category to which the candidate information $d_i$ belongs, $T_i^{T}$ is the feature vector of that information category, $w_i T_i^{T}$ is the weighted feature vector of that category, and the other operand is the feature vector of $d_i$.
It will be appreciated by those skilled in the art that all or part of the flow of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code: a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
Further, the invention also provides a storage device. In one storage device embodiment according to the present invention, the storage device may be configured to store a program for performing the information recommendation method based on VSM and AMMK-means of the above method embodiment, and the program may be loaded and executed by a processor to implement that method. For convenience of explanation, only the parts relevant to the embodiments of the present invention are shown; for specific technical details that are not disclosed, please refer to the method portion of the embodiments of the present invention. The storage device may be a storage device formed of various electronic devices; optionally, the storage in an embodiment of the present invention is a non-transitory computer-readable storage medium.
Further, the invention also provides a control device. In one control device embodiment according to the present invention, the control device includes a processor and a storage device. The storage device may be configured to store a program for executing the information recommendation method based on VSM and AMMK-means of the above method embodiment, and the processor may be configured to execute the programs in the storage device, including but not limited to that program. For convenience of explanation, only the parts relevant to the embodiments of the present invention are shown; for specific technical details that are not disclosed, please refer to the method portion of the embodiments of the present invention. The control device may be a control device formed of various electronic devices.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the specific embodiments of the present invention without departing from the spirit and scope of the present invention, and any modifications and equivalents are intended to be included in the scope of the claims of the present invention.

Claims (4)

1. An information recommendation method based on VSM and AMMK-means, characterized by comprising:
obtaining an item portrait of each piece of candidate information;
substituting the item portrait of each piece of candidate information into a pre-built interest model to obtain the similarity between each piece of candidate information and the user portrait;
recommending the candidate information with the highest similarity to the user;
wherein the interest model is built based on VSM, AMMK-means, and the item portraits of the information that the user has browsed;
the construction of the interest model comprises:
obtaining the item portraits of the information that the user has browsed, and representing them using VSM;
clustering the item portraits of the information that the user has browsed through AMMK-means, and taking the clustering result as the information categories that the user is interested in;
calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information that the user has browsed;
generating the user portrait based on the information categories that the user is interested in and the weights of those categories;
calculating the similarity between the item portrait of the candidate information and the item portraits in the user portrait;
the clustering of the item portraits of the information that the user has browsed through AMMK-means, with the clustering result taken as the information categories that the user is interested in, comprises:
generating a data set based on the item portraits of the information that the user has browsed;
determining the cluster centers and the number of cluster centers for the samples in the data set using the max-min distance clustering algorithm;
using the number of cluster centers as the K value in the K-means algorithm, and using all the obtained cluster centers as the initial cluster centers in the K-means clustering algorithm;
obtaining the clustering result, based on the distance between each sample in the data set and each initial cluster center, when the set constraint condition is satisfied;
taking the clustering result as the information categories that the user is interested in;
the determining of the cluster centers and the number of cluster centers for the samples in the data set using the max-min distance clustering algorithm comprises:
calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
selecting the sample farthest from $C_1$ as the second cluster center $C_2$;
calculating the distances $D_{i1}$ and $D_{i2}$ from all remaining samples to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $\theta$ is a given value and $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$, and if $D_j > \theta D_{12}$, establishing a fourth cluster center;
and so on, until the max-min distance is no greater than $\theta D_{12}$, at which point the search for cluster centers ends and the cluster centers and their number are obtained;
the expression of the interest model is as follows:
$V_{\mathrm{seat}} = (w_1 T_1, w_2 T_2, \ldots, w_m T_m)^{T}$
where $V_{\mathrm{seat}}$ represents the user portrait, $w_m$ represents the weight of the $m$-th information category, and $T_m$ represents the feature vector of the $m$-th information category;
the similarity is calculated by a formula in which seat is the item portrait in the user portrait, $w_i$ is the weight of the information category to which the candidate information $d_i$ belongs, $T_i^{T}$ is the feature vector of the information category to which the candidate information $d_i$ belongs, and the other operand is the feature vector of $d_i$.

2. An information recommendation system based on VSM and AMMK-means, characterized by comprising:
an acquisition module, used for obtaining an item portrait of each piece of candidate information;
a similarity calculation module, used for substituting the item portrait of each piece of candidate information into a pre-built interest model to obtain the similarity between each piece of candidate information and the user portrait;
a recommendation module, used for recommending the candidate information with the highest similarity to the user;
wherein the interest model is built based on VSM, AMMK-means, and the item portraits of the information that the user has browsed;
the system further comprises a construction module for the interest model, and the construction module comprises:
a first construction unit, used for obtaining the item portraits of the information that the user has browsed and representing them using VSM;
an information category construction unit, used for clustering the item portraits of the information that the user has browsed through AMMK-means, and taking the clustering result as the information categories that the user is interested in;
a weight calculation unit, used for calculating the weight of each information category from the number of pieces of browsed information contained in that category and the total number of pieces of information that the user has browsed;
a user portrait construction unit, used for generating the user portrait based on the information categories that the user is interested in and the weights of those categories;
a calculation unit, used for calculating the similarity between the item portrait of the candidate information and the item portraits in the user portrait;
the information category construction unit is specifically used for:
generating a data set based on the item portraits of the information that the user has browsed;
determining the cluster centers and the number of cluster centers for the samples in the data set using the max-min distance clustering algorithm;
using the number of cluster centers as the K value in the K-means algorithm, and using all the obtained cluster centers as the initial cluster centers in the K-means clustering algorithm;
obtaining the clustering result, based on the distance between each sample in the data set and each initial cluster center, when the set constraint condition is satisfied;
taking the clustering result as the information categories that the user is interested in;
the determining, in the information category construction unit, of the cluster centers and the number of cluster centers for the samples in the data set using the max-min distance clustering algorithm specifically comprises:
calculating the average value of the sample attributes, calculating the distance between each sample and the average value, and taking the sample with the minimum distance as the first cluster center $C_1$;
selecting the sample farthest from $C_1$ as the second cluster center $C_2$;
calculating the distances $D_{i1}$ and $D_{i2}$ from all remaining samples to $C_1$ and $C_2$; if $D_l = \max\{\min(D_{i1}, D_{i2}),\ i = 1, 2, \ldots, n\}$ and $D_l > \theta D_{12}$, where $\theta$ is a given value and $D_{12}$ is the distance between $C_1$ and $C_2$, taking $x_l$ as the third cluster center $C_3$;
if $C_3$ exists, calculating $D_j = \max\{\min(D_{i1}, D_{i2}, D_{i3}),\ i = 1, 2, \ldots, n\}$, and if $D_j > \theta D_{12}$, establishing a fourth cluster center;
and so on, until the max-min distance is no greater than $\theta D_{12}$, at which point the search for cluster centers ends and the cluster centers and their number are obtained;
the expression of the interest model is as follows:
$V_{\mathrm{seat}} = (w_1 T_1, w_2 T_2, \ldots, w_m T_m)^{T}$
where $V_{\mathrm{seat}}$ represents the user portrait, $w_m$ represents the weight of the $m$-th information category, and $T_m$ represents the feature vector of the $m$-th information category;
the similarity is calculated by a formula in which seat is the item portrait in the user portrait, $w_i$ is the weight of the information category to which the candidate information $d_i$ belongs, $T_i^{T}$ is the feature vector of the information category to which the candidate information $d_i$ belongs, and the other operand is the feature vector of $d_i$.

3. A storage device storing a plurality of program codes, characterized in that the program codes are suitable for being loaded and run by a processor to execute the information recommendation method based on VSM and AMMK-means described in claim 1.

4. A control device, comprising a processor and a storage device, wherein the storage device is suitable for storing a plurality of program codes, characterized in that the program codes are suitable for being loaded and run by the processor to execute the information recommendation method based on VSM and AMMK-means described in claim 1.
CN202011432407.3A 2020-12-10 2020-12-10 Information recommendation method and system based on VSM and AMMK-means Active CN114625952B (en)

Publications (2)

Publication Number Publication Date
CN114625952A (en) 2022-06-14
CN114625952B (en) 2025-07-18

Family

ID=81896053

Country Status (1)

Country Link
CN (1) CN114625952B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant