
CN117708384A - Graph data storage method, device, equipment and storage medium based on JanusGraph - Google Patents

Info

Publication number
CN117708384A
Authority
CN
China
Prior art keywords
data
graph
necessary
graph data
full
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410136687.5A
Other languages
Chinese (zh)
Other versions
CN117708384B (en
Inventor
周旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongdian Cloud Computing Technology Co ltd
Original Assignee
Zhongdian Cloud Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongdian Cloud Computing Technology Co ltd filed Critical Zhongdian Cloud Computing Technology Co ltd
Priority to CN202410136687.5A priority Critical patent/CN117708384B/en
Publication of CN117708384A publication Critical patent/CN117708384A/en
Application granted granted Critical
Publication of CN117708384B publication Critical patent/CN117708384B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/252Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a JanusGraph-based graph data storage method, device, equipment and storage medium. The method comprises: extracting source data and converting the source data into necessary graph data and full graph data, wherein the necessary graph data comprises the necessary attribute information of a graph data model in a business scenario, and the full graph data comprises the full attribute information of the graph data model; writing the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model; and writing the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model. Without affecting normal operation of the business, the application reduces data redundancy and reduces the write complexity and write performance overhead of JanusGraph, thereby improving the data write performance of JanusGraph.

Description

Graph data storage method, device, equipment and storage medium based on JanusGraph
Technical Field
The present application relates to the field of graph data storage technologies, and in particular, to a graph data storage method, device, equipment and storage medium based on JanusGraph.
Background
A large-scale distributed graph database is a system designed for processing large-scale graph data sets; JanusGraph and TigerGraph are the mainstream large-scale distributed graph databases currently on the market. The distributed architecture of JanusGraph and its support for the attribute graph model make it suitable for many application fields, such as social network analysis, network security analysis and intelligent transportation systems. JanusGraph performs excellently in scenarios that involve complex relationships and large-scale graph data.
However, it also faces challenges when handling very large data sets, especially when frequent write operations are involved. The design of JanusGraph requires one copy of the data to be stored in a columnar database and another copy to be stored in a search engine. This design provides efficient query performance, but it also results in data redundancy, because the same data is held in both the columnar database and the search engine. Moreover, the need to keep the data in the columnar database and the search engine consistent during writing increases the write complexity and the write performance overhead. Therefore, when a large-scale data set has to be written into a JanusGraph database, write performance is low.
Disclosure of Invention
The application provides a graph data storage method, device, equipment and storage medium based on JanusGraph, which can solve the technical problem of low JanusGraph data writing performance in the prior art.
In a first aspect, an embodiment of the present application provides a JanusGraph-based graph data storage method, which includes:
extracting source data and converting the source data into necessary graph data and full graph data, wherein the necessary graph data comprises the necessary attribute information of a graph data model in a business scenario, and the full graph data comprises the full attribute information of the graph data model;
writing the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model;
and writing the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
Further, in an embodiment, before the step of extracting the source data and converting the source data into the necessary graph data and the full graph data, the method further includes:
defining a graph data model based on user operation, and generating metadata of the graph data model, wherein the graph data model identifies the necessary attributes in the business scenario;
and defining the data structures of the columnar database and the search engine through JanusGraph according to the metadata of the graph data model, wherein JanusGraph operates on the search engine for every attribute and additionally operates on the columnar database for each identified necessary attribute.
Further, in an embodiment, the step of converting the source data into the necessary graph data and the full graph data includes:
converting the source data into formatted data according to the metadata of the graph data model, wherein the formatted data comprises the full attribute information of the graph data model and attribute identification information;
and parsing the formatted data to obtain the necessary graph data and the full graph data, wherein attribute information is added to the full graph data for every attribute, and attribute information is added to the necessary graph data only for each identified necessary attribute.
Further, in an embodiment, before the step of extracting the source data and converting the source data into the necessary graph data and the full graph data, the method further includes:
connecting to the source database according to the connection information of the source database.
Further, in an embodiment, the step of converting the source data into the necessary graph data and the full graph data includes:
converting the source data into the necessary graph data according to the metadata of the columnar database;
converting the source data into the full graph data according to the metadata of the search engine.
Further, in an embodiment, before the step of extracting the source data and converting the source data into the necessary graph data and the full graph data, the method further includes:
connecting to the source database according to the connection information of the source database;
obtaining the metadata of the columnar database and the metadata of the search engine.
Further, in an embodiment, the columnar database is one of HBase and Cassandra, and the search engine is one of Elasticsearch, Solr and Lucene.
In a second aspect, an embodiment of the present application further provides a JanusGraph-based graph data storage device, which includes:
a data preparation module, used for extracting source data and converting the source data into necessary graph data and full graph data, wherein the necessary graph data comprises the necessary attribute information of a graph data model in a business scenario, and the full graph data comprises the full attribute information of the graph data model;
a first writing module, used for writing the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model;
and a second writing module, used for writing the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
In a third aspect, an embodiment of the present application further provides JanusGraph-based graph data storage equipment, which includes a processor, a memory, and a JanusGraph-based graph data storage program stored on the memory and executable by the processor, wherein the JanusGraph-based graph data storage program, when executed by the processor, implements the steps of the above JanusGraph-based graph data storage method.
In a fourth aspect, an embodiment of the present application further provides a storage medium on which a JanusGraph-based graph data storage program is stored, wherein the JanusGraph-based graph data storage program, when executed by a processor, implements the steps of the above JanusGraph-based graph data storage method.
In the present application, necessary attributes and unnecessary attributes are distinguished in the graph data model according to the usage requirements of the business scenario; the data structure of the columnar database is defined according to the necessary attributes of the graph data model, and the data structure of the search engine is defined according to the full attributes of the graph data model; the extracted source data is converted into necessary graph data and full graph data; and, through JanusGraph, the necessary graph data is written into the columnar database and the full graph data is written into the search engine. Without affecting normal operation of the business, the application reduces data redundancy and reduces the write complexity and write performance overhead of JanusGraph, thereby improving the data write performance of JanusGraph.
Drawings
FIG. 1 is a flowchart of a graph data storage method based on JanusGraph according to an embodiment of the present application;
FIG. 2 is a timing diagram of a JanusGraph-based graph data storage method according to an embodiment of the present application;
FIG. 3 is a timing diagram of a JanusGraph-based graph data storage method according to another embodiment of the present application;
FIG. 4 is a schematic diagram of the hardware structure of JanusGraph-based graph data storage equipment according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
First, some technical terms in the present application are explained so as to facilitate understanding of the present application by those skilled in the art.
JanusGraph: JanusGraph is an open-source distributed graph database system. It supports storage and query of large-scale graph data and can run in a distributed manner across multiple servers to achieve horizontal scaling. It supports the attribute graph model, in which both nodes and edges can contain attribute information.
Attribute graph: an attribute graph (also called a property graph) is a graph data model in which the nodes and edges of the graph may each contain attribute information. It is a common data model in graph databases and allows attributes in the form of key-value pairs to be stored on nodes and edges.
Graph data model: data models in graph databases are generally divided into two main types: the attribute graph model and the triple model. In the attribute graph model, both nodes and edges may contain attribute information in the form of key-value pairs. The triple model is composed of a series of triples, each consisting of a subject, a predicate and an object. The subject and object typically represent nodes, and the predicate represents the relationship (edge) between them.
Vertex: a vertex (node) represents an entity in the graph and may have attribute information. For example, in a social network, a vertex may represent a user.
Edge: an edge connects two nodes and represents the relationship between them; it may also have attribute information. For example, in a social network, an edge may represent a friend relationship between two users.
Columnar database: a columnar database can efficiently store large-scale structured data and provides distributed storage and query capabilities. In JanusGraph it mainly provides the graph data analysis function.
Search engine: a search engine can efficiently build full-text indexes, execute complex text queries, and support relevance ranking and high-performance search. In JanusGraph it mainly provides the full-text retrieval function.
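To make the vertex, edge and attribute terms above concrete, the following is a minimal sketch of an attribute graph held in plain Python objects. The class and field names (Vertex, Edge, out_v, in_v) are illustrative assumptions and are not taken from the application or from the JanusGraph API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Vertex:
    """An entity in the graph; attributes are key-value pairs."""
    id: str
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A relationship between two vertices; it may also carry attributes."""
    id: str
    label: str
    out_v: str  # id of the vertex the edge starts from
    in_v: str   # id of the vertex the edge points to
    properties: Dict[str, Any] = field(default_factory=dict)


# Social-network example from the text: two users joined by a friend edge.
alice = Vertex("v1", "user", {"name": "Alice"})
bob = Vertex("v2", "user", {"name": "Bob"})
friend = Edge("e1", "friend", alice.id, bob.id, {"since": 2020})
```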
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In a first aspect, an embodiment of the present application provides a JanusGraph-based graph data storage method.
Fig. 1 is a flowchart of the JanusGraph-based graph data storage method in an embodiment of the present application.
Referring to Fig. 1, in one embodiment, the JanusGraph-based graph data storage method includes the following steps:
S11, extracting source data and converting the source data into necessary graph data and full graph data, wherein the necessary graph data comprises the necessary attribute information of a graph data model in a business scenario, and the full graph data comprises the full attribute information of the graph data model.
Specifically, data extraction refers to extracting the required source data from the source database. Data conversion means that the extracted source data is converted and integrated so as to meet the requirements of the graph data model. This may involve operations such as data cleansing, format conversion and data field mapping to ensure that the data is suitable for writing into the graph database.
Optionally, the data conversion process may employ techniques such as caching where necessary, asynchronous multithreading, batch submission and Bloom filter deduplication to improve data loading performance.
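As an illustration of two of the optional techniques named above, the sketch below combines Bloom filter deduplication with batch submission. It is a simplified, assumed implementation: the filter size, the number of hash functions and the function name dedup_batches are hypothetical and do not come from the application.

```python
import hashlib
from typing import Iterable, Iterator, List


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


def dedup_batches(keys: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Skip keys the filter has probably seen, and emit the rest in batches."""
    seen = BloomFilter()
    batch: List[str] = []
    for key in keys:
        if key in seen:  # probably a duplicate; a rare false positive drops a new key
            continue
        seen.add(key)
        batch.append(key)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


batches = list(dedup_batches(["a", "b", "a", "c", "b", "d"], batch_size=2))
```

Because a Bloom filter can report a key as seen when it is not, a production pipeline would normally confirm suspected duplicates against an exact store before discarding them.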
In this embodiment, necessary attributes and unnecessary attributes are distinguished in the graph data model according to the usage requirements of the business scenario. For example, a staff table has many attributes, such as ID card number, name, gender, year and month of birth, age, household registration, phone, residential address, WeChat ID, QQ number, mailbox, bank account number, property information, vehicle information, education, employer, etc. In an ID card business scenario, however, the ID card number, name, gender, year and month of birth and household registration are enough to meet the usage requirements, so these attributes are necessary attributes in that business scenario and the other attributes are unnecessary attributes.
In this embodiment, during data conversion, the graph data to be written into the columnar database is assembled into the necessary graph data, and the graph data to be written into the search engine is assembled into the full graph data. The necessary graph data and the full graph data are each sent to JanusGraph, and JanusGraph writes the necessary graph data into the columnar database and the full graph data into the search engine. In the prior art, by contrast, only full graph data is produced by the conversion; the full graph data is sent to JanusGraph, and JanusGraph writes it into both the columnar database and the search engine.
The graph data model itself can be complex, containing a large number of nodes and edges as well as their attributes. Data consistency must be ensured at write time, which may involve updating multiple nodes and edges. Because a graph database emphasizes the relationships between nodes, write operations typically involve the creation, update or deletion of multiple nodes and edges, and this complexity increases the overhead of write operations. The present scheme reduces the amount of data written into the columnar database, which reduces the write complexity and write performance overhead of JanusGraph and thereby improves data write performance. The necessary graph data written into the columnar database still meets the usage requirements of the business scenario and ensures normal operation of the business.
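The difference between the prior art and the present scheme can be illustrated with a short sketch that compares what each store receives for one staff record. The field names and sample values are hypothetical, and the functions only model the payloads; they do not call JanusGraph.

```python
from typing import Any, Dict, Set

Record = Dict[str, Any]

# Necessary attributes for the ID card scenario described in the text
# (field names are illustrative).
NECESSARY: Set[str] = {
    "id_number", "name", "gender", "birth_year_month", "household_registration",
}


def payloads_prior_art(record: Record) -> Dict[str, Record]:
    """Prior art: the full record is written to both stores."""
    return {"column_store": dict(record), "search_engine": dict(record)}


def payloads_this_scheme(record: Record, necessary: Set[str]) -> Dict[str, Record]:
    """This scheme: only necessary attributes reach the columnar store."""
    return {
        "column_store": {k: v for k, v in record.items() if k in necessary},
        "search_engine": dict(record),
    }


staff = {
    "id_number": "ID-0001",
    "name": "Zhang San",
    "gender": "M",
    "birth_year_month": "1990-01",
    "household_registration": "Wuhan",
    "phone": "000-0000",
    "email": "zs@example.com",
}

before = payloads_prior_art(staff)
after = payloads_this_scheme(staff, NECESSARY)
saved_columns = len(before["column_store"]) - len(after["column_store"])
```

In this sample, two of the seven attributes (phone and email) are no longer written to the columnar store, while the search engine payload is unchanged.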
S12, writing the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model.
Specifically, after the necessary graph data is sent to JanusGraph, the native write interface of the columnar database is called to write the necessary graph data into the columnar database. Before the necessary graph data is written into the columnar database for the first time, the data structure of the columnar database needs to be defined according to the necessary attributes of the graph data model, that is, the data model is built in the columnar database.
S13, writing the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
Specifically, after the full graph data is sent to JanusGraph, the native write interface of the search engine is called to write the full graph data into the search engine. Before the full graph data is written into the search engine for the first time, the data structure of the search engine needs to be defined according to the full attributes of the graph data model, that is, the data model is built in the search engine.
In this embodiment, necessary attributes and unnecessary attributes are distinguished in the graph data model according to the usage requirements of the business scenario; the data structure of the columnar database is defined according to the necessary attributes of the graph data model, and the data structure of the search engine is defined according to the full attributes of the graph data model; the extracted source data is converted into necessary graph data and full graph data; and, through JanusGraph, the necessary graph data is written into the columnar database and the full graph data is written into the search engine. This embodiment reduces data redundancy without affecting normal operation of the business, and reduces the write complexity and write performance overhead of JanusGraph, thereby improving the data write performance of JanusGraph.
Optionally, the columnar database is one of HBase and Cassandra, and the search engine is one of Elasticsearch, Solr and Lucene.
Further, in an embodiment, the columnar database is HBase and the search engine is Elasticsearch. Nodes are stored in HBase as rows addressed by row key, where each row corresponds to one node and a column family is used to store the attributes of the node. Edges are likewise stored in HBase as rows addressed by row key, where each row corresponds to one edge and a column family is used to store the attributes of the edge. In Elasticsearch, nodes and edges are stored as documents; each document corresponds to one node or edge and contains the attribute information of that node or edge. The data in the two stores is associated through the node and edge IDs in HBase and the document IDs in Elasticsearch.
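The storage layout described above can be pictured with the following sketch, which builds an HBase-style row and an Elasticsearch-style document for the same node and links them by a shared ID. It follows only the description in this embodiment; it is not the actual on-disk encoding used by JanusGraph, and the column family name "p" and the dictionary shapes are assumptions made for illustration.

```python
from typing import Any, Dict, Set


def to_hbase_row(element_id: str, props: Dict[str, Any], family: str = "p") -> Dict[str, Any]:
    """One row per node or edge; attributes live under one column family."""
    return {
        "row_key": element_id,
        "columns": {f"{family}:{name}": value for name, value in props.items()},
    }


def to_es_document(element_id: str, props: Dict[str, Any]) -> Dict[str, Any]:
    """One document per node or edge, holding all of its attributes."""
    return {"_id": element_id, "_source": dict(props)}


full_props = {"id_number": "ID-0001", "name": "Zhang San", "phone": "000-0000"}
necessary_names: Set[str] = {"id_number", "name"}

row = to_hbase_row("v1", {k: v for k, v in full_props.items() if k in necessary_names})
doc = to_es_document("v1", full_props)

# The two records are associated through the shared element ID.
linked = row["row_key"] == doc["_id"]
```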
Further, in an embodiment, before step S11, the method further includes:
defining a graph data model based on user operation, and generating metadata of the graph data model, wherein the graph data model identifies the necessary attributes in the business scenario;
and defining the data structures of the columnar database and the search engine through JanusGraph according to the metadata of the graph data model, wherein JanusGraph operates on the search engine for every attribute and additionally operates on the columnar database for each identified necessary attribute.
In this embodiment, when defining the graph data model, the user not only designs the graph structure but also marks which attributes of the vertices and edges are necessary. The user works in an interactive interface, and the metadata of the graph data model is generated automatically once the operation is completed. After the metadata of the graph data model is sent to JanusGraph, a custom graph data modeling interface is called to build the data model in the columnar database and the search engine.
The custom interface differs from the native interface as follows: the native interface builds identical data models in the columnar database and the search engine, whereas the custom interface distinguishes necessary attributes from unnecessary attributes according to the attribute identification, and for an unnecessary attribute it models only the search engine and not the columnar database. Different data models are therefore built in the columnar database and the search engine.
It should be noted that, in this scheme, there is only one graph data model defined by user operation, and the data models of the columnar database and the search engine are abstract representations of their respective data structures. Because the columnar database and the search engine organize and store data differently, their data structures would differ even if the data models of the two were the same.
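The behaviour of the custom modeling interface can be sketched as a function that derives two store schemas from one set of graph data model metadata. The AttributeMeta structure and the schema dictionaries are hypothetical stand-ins; the application does not specify the metadata format, and the sketch does not call the JanusGraph management API.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AttributeMeta:
    """One attribute of the user-defined graph data model (assumed shape)."""
    name: str
    data_type: str
    necessary: bool  # attribute identification set by the user


def build_store_schemas(
    attributes: List[AttributeMeta],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (column_schema, search_schema) from one model definition."""
    # Every attribute is modeled in the search engine.
    search_schema = {a.name: a.data_type for a in attributes}
    # Only identified necessary attributes are modeled in the columnar database.
    column_schema = {a.name: a.data_type for a in attributes if a.necessary}
    return column_schema, search_schema


person_model = [
    AttributeMeta("id_number", "string", True),
    AttributeMeta("name", "string", True),
    AttributeMeta("phone", "string", False),
    AttributeMeta("age", "integer", False),
]

column_schema, search_schema = build_store_schemas(person_model)
```

A native-style interface would correspond to returning search_schema for both stores.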
Further, in an embodiment, the step of converting the source data into the necessary graph data and the full graph data includes:
converting the source data into formatted data according to the metadata of the graph data model, wherein the formatted data comprises the full attribute information of the graph data model and attribute identification information;
and parsing the formatted data to obtain the necessary graph data and the full graph data, wherein attribute information is added to the full graph data for every attribute, and attribute information is added to the necessary graph data only for each identified necessary attribute.
In this embodiment, the graph structure settings and the attribute identification can be read from the metadata of the graph data model. The formatted data obtained by conversion on this basis contains not only the graph data with the full attribute information but also the attribute identification information. When the formatted data is parsed, only the attribute information of identified necessary attributes is added to the necessary graph data, while the attribute information of all attributes is added to the full graph data, so that the necessary graph data and the full graph data are both assembled.
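The parsing step above can be sketched as follows. The JSON layout of the formatted data (an attributes list in which each entry carries a necessary flag) is an assumption made for illustration; the application does not define a concrete format.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical formatted data: full attribute information plus identification flags.
formatted_json = """
{
  "label": "person",
  "id": "v1",
  "attributes": [
    {"name": "id_number", "value": "ID-0001", "necessary": true},
    {"name": "name", "value": "Zhang San", "necessary": true},
    {"name": "phone", "value": "000-0000", "necessary": false}
  ]
}
"""


def parse_formatted(formatted: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split formatted data into (necessary graph data, full graph data)."""
    necessary: Dict[str, Any] = {}
    full: Dict[str, Any] = {}
    for attr in formatted["attributes"]:
        full[attr["name"]] = attr["value"]           # every attribute
        if attr["necessary"]:
            necessary[attr["name"]] = attr["value"]  # identified attributes only
    return necessary, full


necessary_data, full_data = parse_formatted(json.loads(formatted_json))
```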
Further, in an embodiment, before step S11, the method further includes:
and connecting the source database according to the connection information of the source database.
In this embodiment, the source database must be connected before the data extraction operation, and the data conversion operation relies on the metadata of the graph data model. If the corresponding execution subject does not already hold the connection information of the source database and the metadata of the graph data model, it needs to obtain them from another source and store them.
Fig. 2 shows a timing diagram of the JanusGraph-based graph data storage method in an embodiment of the present application.
Referring to Fig. 2, the system architecture includes a source database, a business system, a data preparation module, JanusGraph, a columnar database, and a search engine. The data model definition and modeling operations are as follows:
A1, the source data is registered in the business system, wherein the registration information comprises the URL, user name, password, database table structure metadata, connection information and the like;
A2, the user defines the graph data model in an interactive interface provided by the business system, and the metadata of the graph data model generated after the operation is completed is stored in the business system;
A3, JanusGraph acquires the metadata of the graph data model from the business system;
A4, JanusGraph defines the data structures of the columnar database and the search engine according to the metadata of the graph data model.
The preparatory operations before data writing are as follows:
B1, the data preparation module acquires the connection information of the source database and the metadata of the graph data model from the business system;
B2, the data preparation module connects to the source database according to the connection information of the source database.
The operations related to data writing are as follows:
C1, the data preparation module extracts source data from the source database;
C2, the data preparation module converts the source data into formatted data according to the metadata of the graph data model;
C3, the data preparation module parses the formatted data to obtain the necessary graph data and the full graph data;
C4, the data preparation module sends the necessary graph data and the full graph data to JanusGraph;
C5, JanusGraph writes the necessary graph data into the columnar database and writes the full graph data into the search engine.
It will be appreciated that the data model definition and modeling operations and the preparatory operations before data writing are both one-time operations and do not need to be repeated unless the graph data model needs to be modified or the source database is disconnected. The operations related to data writing are performed continuously as the data is updated.
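The division between one-time setup and the repeated write cycle C1 to C5 can be sketched with in-memory stand-ins. InMemoryGraphGateway is a hypothetical placeholder for JanusGraph and its two back ends, and the model dictionary (attribute name mapped to a necessary flag) is an assumed simplification of the graph data model metadata.

```python
from typing import Any, Dict, Iterable, Tuple

Record = Dict[str, Any]


class InMemoryGraphGateway:
    """Hypothetical stand-in for JanusGraph fronting the two stores."""

    def __init__(self) -> None:
        self.column_store: Dict[str, Record] = {}
        self.search_engine: Dict[str, Record] = {}

    def write(self, key: str, necessary: Record, full: Record) -> None:
        # C5: necessary data to the columnar store, full data to the search engine.
        self.column_store[key] = necessary
        self.search_engine[key] = full


def prepare(model: Dict[str, bool], row: Record) -> Tuple[Record, Record]:
    # C2 + C3: keep the modeled attributes, then split on the necessary flag.
    full = {k: v for k, v in row.items() if k in model}
    necessary = {k: v for k, v in full.items() if model[k]}
    return necessary, full


def run_write_cycle(
    source_rows: Iterable[Record],
    model: Dict[str, bool],
    key_field: str,
    gateway: InMemoryGraphGateway,
) -> int:
    written = 0
    for row in source_rows:                                   # C1: extract
        necessary, full = prepare(model, row)
        gateway.write(str(row[key_field]), necessary, full)   # C4: send
        written += 1
    return written


# One-time setup (stands in for A2 and B1): the model is defined and fetched once.
model = {"id_number": True, "name": True, "phone": False}
gateway = InMemoryGraphGateway()

# Repeated on every data update.
source_rows = [
    {"id_number": "ID-0001", "name": "Zhang San", "phone": "000-0000"},
    {"id_number": "ID-0002", "name": "Li Si", "phone": "000-0001"},
]
count = run_write_cycle(source_rows, model, "id_number", gateway)
```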
Further, in an embodiment, the step of converting the source data into the necessary graph data and the full graph data includes:
converting the source data into the necessary graph data according to the metadata of the columnar database;
converting the source data into the full graph data according to the metadata of the search engine.
In this embodiment, the metadata of the columnar database and of the search engine includes information about their data structures, so the corresponding metadata indicates what content the necessary graph data and the full graph data should contain after conversion.
Thus, the metadata of the user-defined graph data model does not need to be acquired; once the data structures of the columnar database and the search engine have been defined, the data conversion operation can be completed according to the metadata of the columnar database and the search engine. The data model definition and modeling operations may be the same as or different from those of the previous embodiment; neither case affects the implementation of this embodiment.
Further, before step S11, the method further includes:
connecting the source database according to the connection information of the source database;
metadata of a columnar database and metadata of a search engine are obtained.
In this embodiment, the source database must be connected before the data extraction operation, and the data conversion operation relies on the metadata of the columnar database and the search engine. If the corresponding execution subject does not already hold the connection information of the source database and the metadata of the columnar database and the search engine, it needs to obtain them from another source and store them.
Fig. 3 shows a timing diagram of a graph data storage method based on janus graph in another embodiment of the present application.
Referring to fig. 3, the system architecture includes a source database, a data preparation module, a janus graph, a column database, and a search engine. Fig. 3 does not limit the data model definition and modeling operations, which may be the same or different from steps A1 to A4 in fig. 2. The pre-operation before data writing is as follows:
the data preparation module is connected with the source database according to the connection information of the source database;
d2, the data preparation module acquires metadata of the columnar database and metadata of the search engine from the janus graph.
The operations related to data writing are as follows:
E1, the data preparation module extracts source data from the source database;
E2, the data preparation module converts the source data into necessary graph data according to the metadata of the columnar database;
E3, the data preparation module converts the source data into full graph data according to the metadata of the search engine;
E4, the data preparation module sends the necessary graph data and the full graph data to JanusGraph;
E5, JanusGraph writes the necessary graph data into the columnar database and writes the full graph data into the search engine.
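As an illustration only, steps E2 to E5 can be sketched in Python as follows. The sketch assumes that the metadata of each backend is simply a list of property names and uses in-memory lists as stand-ins for the columnar database and the search engine; the names `COLUMNAR_METADATA`, `SEARCH_METADATA`, `convert`, and `store` are hypothetical and are not part of JanusGraph or of the application.

```python
from typing import Any

Row = dict[str, Any]

# Hypothetical metadata obtained in step D2: the property names each backend defines.
COLUMNAR_METADATA = ["id", "name"]               # necessary attributes only
SEARCH_METADATA = ["id", "name", "age", "city"]  # full attribute set


def convert(row: Row, backend_metadata: list[str]) -> Row:
    """Keep only the properties that the backend metadata defines (steps E2 and E3)."""
    return {name: row[name] for name in backend_metadata if name in row}


def store(rows: list[Row]) -> tuple[list[Row], list[Row]]:
    """Run steps E2 to E5 against in-memory stand-ins for the two backends."""
    columnar_store: list[Row] = []  # stand-in for the columnar database
    search_store: list[Row] = []    # stand-in for the search engine
    for row in rows:
        necessary = convert(row, COLUMNAR_METADATA)  # E2
        full = convert(row, SEARCH_METADATA)         # E3
        # E4 and E5: in the described method, JanusGraph receives both
        # data sets and performs the writes; here they are plain appends.
        columnar_store.append(necessary)
        search_store.append(full)
    return columnar_store, search_store


source_rows = [{"id": 1, "name": "Ann", "age": 30, "city": "Wuhan"}]
columnar_store, search_store = store(source_rows)
print(columnar_store)
print(search_store)
```

In this sketch the same source row yields a narrow record for the columnar stand-in and a complete record for the search stand-in, which mirrors the split between necessary graph data and full graph data.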
In a second aspect, an embodiment of the present application further provides a JanusGraph-based graph data storage device.
In one embodiment, the JanusGraph-based graph data storage device includes:
a data preparation module, configured to extract source data and convert the source data into necessary graph data and full graph data, wherein the necessary graph data includes the necessary attribute information of a graph data model in a business scenario, and the full graph data includes the full attribute information of the graph data model;
a first writing module, configured to write the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model;
a second writing module, configured to write the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
Further, in an embodiment, the JanusGraph-based graph data storage device further includes:
a model definition module, configured to define a graph data model based on user operations and generate metadata of the graph data model, wherein the graph data model marks the attributes that are necessary in the business scenario;
a structure definition module, configured to define, through JanusGraph, the data structures of the columnar database and the search engine according to the metadata of the graph data model, wherein JanusGraph can operate on the search engine for every attribute and can additionally operate on the columnar database for each attribute marked as necessary.
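The rule applied by the structure definition module (every attribute goes to the search engine, marked attributes also go to the columnar database) can be sketched as below. This is a minimal illustration under assumed data shapes: `AttributeMeta` and `define_structures` are hypothetical names, the schemas are plain dictionaries, and no real JanusGraph management call is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeMeta:
    """One attribute of the graph data model (hypothetical metadata shape)."""
    name: str
    data_type: str
    necessary: bool  # True when the user marked the attribute as necessary


def define_structures(
    attributes: list[AttributeMeta],
) -> tuple[dict[str, str], dict[str, str]]:
    """Derive the two backend schemas from the graph data model metadata."""
    columnar_schema: dict[str, str] = {}
    search_schema: dict[str, str] = {}
    for attr in attributes:
        # Every attribute is defined in the search engine.
        search_schema[attr.name] = attr.data_type
        # Only attributes marked as necessary are also defined in the columnar database.
        if attr.necessary:
            columnar_schema[attr.name] = attr.data_type
    return columnar_schema, search_schema


model = [
    AttributeMeta("id", "Long", True),
    AttributeMeta("name", "String", True),
    AttributeMeta("remark", "String", False),
]
columnar_schema, search_schema = define_structures(model)
print(columnar_schema)
print(search_schema)
```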
Further, in an embodiment, the data preparation module is configured to:
convert the source data into formatted data according to the metadata of the graph data model, wherein the formatted data includes the full attribute information and the attribute identification information of the graph data model;
parse the formatted data to obtain the necessary graph data and the full graph data, wherein attribute information is added to the full graph data for every attribute, and attribute information is added to the necessary graph data for each attribute marked as necessary.
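The two-stage conversion above can be sketched as follows. The layout of the formatted data is an assumption made for illustration: each field carries its name, its value, and a `necessary` flag standing in for the attribute identification information. The function names `to_formatted` and `parse_formatted` are hypothetical.

```python
from typing import Any


def to_formatted(row: dict[str, Any], model_meta: dict[str, bool]) -> list[dict[str, Any]]:
    """Attach the necessary/optional flag from the model metadata to every attribute value."""
    return [
        {"name": name, "value": row[name], "necessary": necessary}
        for name, necessary in model_meta.items()
        if name in row
    ]


def parse_formatted(
    formatted: list[dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split formatted data into necessary graph data and full graph data."""
    necessary_data: dict[str, Any] = {}
    full_data: dict[str, Any] = {}
    for field in formatted:
        # Every attribute is added to the full graph data.
        full_data[field["name"]] = field["value"]
        # Only flagged attributes are added to the necessary graph data.
        if field["necessary"]:
            necessary_data[field["name"]] = field["value"]
    return necessary_data, full_data


# Hypothetical model metadata: attribute name -> marked as necessary?
model_meta = {"id": True, "name": True, "remark": False}
formatted = to_formatted({"id": 7, "name": "order", "remark": "rush"}, model_meta)
necessary_data, full_data = parse_formatted(formatted)
print(necessary_data)
print(full_data)
```

Carrying the flag inside the formatted data means the parsing step needs no further access to the model metadata, which is one plausible reason for the intermediate format.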
Further, in an embodiment, the data preparation module is further configured to:
connect to the source database according to the connection information of the source database.
Further, in an embodiment, the data preparation module is configured to:
convert the source data into necessary graph data according to the metadata of the columnar database;
convert the source data into full graph data according to the metadata of the search engine.
Further, in an embodiment, the data preparation module is further configured to:
connect to the source database according to the connection information of the source database;
obtain metadata of the columnar database and metadata of the search engine.
Further, in an embodiment, the columnar database is one of HBase and Cassandra, and the search engine is one of Elasticsearch, Solr and Lucene.
The functions of the modules in the JanusGraph-based graph data storage device correspond to the steps in the embodiments of the JanusGraph-based graph data storage method; the functions and implementation processes of the modules are not described again here.
In a third aspect, an embodiment of the present application provides JanusGraph-based graph data storage equipment, which may be a personal computer (PC), a notebook computer, a server, or another device with a data processing function.
Fig. 4 shows a schematic hardware structure of the JanusGraph-based graph data storage equipment in an embodiment of the present application.
Referring to Fig. 4, in an embodiment of the present application, the JanusGraph-based graph data storage equipment may include a processor, a memory, a communication interface, and a communication bus.
The communication bus may be any type of bus that interconnects the processor, the memory, and the communication interface.
The communication interface includes input/output (I/O) interfaces, physical interfaces, logical interfaces, and the like, which interconnect components inside the JanusGraph-based graph data storage equipment, as well as interfaces that connect the equipment to other devices (e.g., other computing devices or user devices). The physical interface may be an Ethernet interface, a fiber-optic interface, an ATM interface, etc.; the user device may be a display, a keyboard, or the like.
The memory may be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), flash memory, optical memory, hard disk, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), and the like.
The processor may be a general-purpose processor, which can call the JanusGraph-based graph data storage program stored in the memory and execute the JanusGraph-based graph data storage method provided in the embodiments of the present application. For example, the general-purpose processor may be a central processing unit (CPU). For the method executed when the JanusGraph-based graph data storage program is called, refer to the embodiments of the JanusGraph-based graph data storage method of the present application; details are not repeated here.
Those skilled in the art will appreciate that the hardware configuration shown in Fig. 4 does not limit the present application; the equipment may include more or fewer components than shown, combine certain components, or arrange the components differently.
In a fourth aspect, embodiments of the present application further provide a storage medium.
The storage medium of the present application stores a JanusGraph-based graph data storage program, wherein the program, when executed by a processor, implements the steps of the JanusGraph-based graph data storage method described above.
For the method implemented when the JanusGraph-based graph data storage program is executed, refer to the embodiments of the JanusGraph-based graph data storage method of the present application; details are not repeated here.
It should be noted that the foregoing embodiment numbers are for description only and do not indicate the relative merits of the embodiments.
The terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the foregoing drawings are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and "third," etc. are used to distinguish between different objects and not necessarily to describe a sequential or chronological order, nor do they require that the objects labeled "first," "second," and "third" be different.
In the description of embodiments of the present application, "exemplary," "such as," or "for example," etc., are used to indicate an example, instance, or illustration. Any embodiment or design described herein as "exemplary," "such as" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary," "such as" or "for example," etc., is intended to present related concepts in a concrete fashion.
In the description of the embodiments of the present application, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association between the associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A exists alone, A and B both exist, and B exists alone. In addition, in the description of the embodiments of the present application, "plural" means two or more.
In some of the processes described in the embodiments of the present application, a plurality of operations or steps occurring in a particular order are included, but it should be understood that these operations or steps may be performed out of the order in which they occur in the embodiments of the present application or in parallel, the sequence numbers of the operations merely serve to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the processes may include more or fewer operations, and the operations or steps may be performed in sequence or in parallel, and the operations or steps may be combined.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the preferred implementation. Based on this understanding, the part of the technical solution of the present application that is essential, or that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium as described above (e.g., ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device to perform the methods described in the embodiments of the present application.
The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the claims, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application, or direct or indirect application in other related technical fields are included in the scope of the claims of the present application.

Claims (10)

1. A JanusGraph-based graph data storage method, characterized by comprising the following steps:
extracting source data and converting the source data into necessary graph data and full graph data, wherein the necessary graph data includes the necessary attribute information of a graph data model in a business scenario, and the full graph data includes the full attribute information of the graph data model;
writing the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model;
writing the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
2. The JanusGraph-based graph data storage method of claim 1, further comprising, prior to the step of extracting source data and converting the source data into the necessary graph data and the full graph data:
defining a graph data model based on user operations and generating metadata of the graph data model, wherein the graph data model marks the attributes that are necessary in the business scenario;
defining, through JanusGraph, the data structures of the columnar database and the search engine according to the metadata of the graph data model, wherein JanusGraph operates on the search engine for every attribute and additionally operates on the columnar database for each attribute marked as necessary.
3. The JanusGraph-based graph data storage method of claim 2, wherein the converting the source data into the necessary graph data and the full graph data includes:
converting the source data into formatted data according to the metadata of the graph data model, wherein the formatted data includes the full attribute information and the attribute identification information of the graph data model;
parsing the formatted data to obtain the necessary graph data and the full graph data, wherein attribute information is added to the full graph data for every attribute, and attribute information is added to the necessary graph data for each attribute marked as necessary.
4. The JanusGraph-based graph data storage method of claim 3, further comprising, prior to the step of extracting source data and converting the source data into the necessary graph data and the full graph data:
connecting to the source database according to the connection information of the source database.
5. The JanusGraph-based graph data storage method of claim 1, wherein the converting the source data into the necessary graph data and the full graph data includes:
converting the source data into necessary graph data according to the metadata of the columnar database;
converting the source data into full graph data according to the metadata of the search engine.
6. The JanusGraph-based graph data storage method of claim 5, further comprising, prior to the step of extracting source data and converting the source data into the necessary graph data and the full graph data:
connecting to the source database according to the connection information of the source database;
obtaining metadata of the columnar database and metadata of the search engine.
7. The JanusGraph-based graph data storage method of any one of claims 1 to 6, wherein the columnar database is one of HBase and Cassandra, and the search engine is one of Elasticsearch, Solr and Lucene.
8. A JanusGraph-based graph data storage device, characterized in that the JanusGraph-based graph data storage device comprises:
a data preparation module, configured to extract source data and convert the source data into necessary graph data and full graph data, wherein the necessary graph data includes the necessary attribute information of a graph data model in a business scenario, and the full graph data includes the full attribute information of the graph data model;
a first writing module, configured to write the necessary graph data into a columnar database through JanusGraph, wherein the data structure of the columnar database is defined according to the necessary attributes of the graph data model;
a second writing module, configured to write the full graph data into a search engine through JanusGraph, wherein the data structure of the search engine is defined according to the full attributes of the graph data model.
9. JanusGraph-based graph data storage equipment, characterized in that the equipment comprises a processor, a memory, and a JanusGraph-based graph data storage program stored on the memory and executable by the processor, wherein the program, when executed by the processor, implements the steps of the JanusGraph-based graph data storage method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a JanusGraph-based graph data storage program is stored on the storage medium, wherein the program, when executed by a processor, implements the steps of the JanusGraph-based graph data storage method according to any one of claims 1 to 7.
CN202410136687.5A 2024-01-31 2024-01-31 Graph data storage method, device, equipment and storage medium based on JanusGraph Active CN117708384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410136687.5A CN117708384B (en) 2024-01-31 2024-01-31 Graph data storage method, device, equipment and storage medium based on JanusGraph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410136687.5A CN117708384B (en) 2024-01-31 2024-01-31 Graph data storage method, device, equipment and storage medium based on JanusGraph

Publications (2)

Publication Number Publication Date
CN117708384A true CN117708384A (en) 2024-03-15
CN117708384B CN117708384B (en) 2024-08-09

Family

ID=90155573

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410136687.5A Active CN117708384B (en) 2024-01-31 2024-01-31 Graph data storage method, device, equipment and storage medium based on JanusGraph

Country Status (1)

Country Link
CN (1) CN117708384B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12406006B2 (en) 2021-11-02 2025-09-02 Alipay (Hangzhou) Information Technology Co., Ltd. Graph data query

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427359A (en) * 2019-06-27 2019-11-08 苏州浪潮智能科技有限公司 A kind of diagram data treating method and apparatus
US20200097504A1 (en) * 2018-06-07 2020-03-26 Data.World, Inc. Method and system for editing and maintaining a graph schema
CN112905854A (en) * 2021-03-05 2021-06-04 北京中经惠众科技有限公司 Data processing method and device, computing equipment and storage medium
CN115470284A (en) * 2022-10-12 2022-12-13 中电云数智科技有限公司 A method and device for importing multi-source heterogeneous data sources into Janusgraph graph database
WO2023078120A1 (en) * 2021-11-02 2023-05-11 支付宝(杭州)信息技术有限公司 Graph data querying

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
沈志宏;赵子豪;王海波;: "以图为中心的新型大数据技术栈研究", 数据分析与知识发现, no. 07 *

Also Published As

Publication number Publication date
CN117708384B (en) 2024-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant