CN118193537A - Method and system for creating openGauss external partition table partition indexes in parallel - Google Patents
- Publication number
- CN118193537A (application number CN202410422216.0A)
- Authority
- CN
- China
- Prior art keywords
- partition
- index
- sub
- creating
- task queue
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/466—Transaction processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of local index creation, and provides a method and a system for creating openGauss external partition table partition indexes in parallel. The method comprises the following steps: obtaining a partition list of an external partition table, and creating corresponding partition index metadata for each partition in the partition list; creating task queue elements according to partition information and the partition index metadata, and constructing a task queue from the created task queue elements; setting a parallelism for the sub-threads that create partition indexes, the main thread sending the main transaction information and the task queue address to each sub-thread according to the set parallelism; the sub-threads preemptively acquiring task queue elements from the task queue in parallel according to the task queue address, and creating the corresponding partition indexes according to the acquired task queue elements; and the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes. The invention can improve the efficiency of creating partition indexes.
Description
Technical Field
The invention relates to the technical field of local index creation, and in particular to a method and a system for creating openGauss external partition table partition indexes in parallel.
Background
The openGauss database supports partition tables. After a table is partitioned, the logical table is still one complete table; only its data is physically stored in several table spaces (physical files), so that a query needs to scan only the relevant partitions. Indexes on a partition table are divided into local indexes and global indexes: a local index is an independent index created for each partition, whereas a global index is created on the entire table and is not limited to any particular partition.
The openGauss database supports local indexes on partition tables and, to make index creation more efficient, supports creating local indexes in parallel. In essence, however, this is a simple extension of parallel index creation for non-partition tables: each partition is traversed in turn, and an index is created for it in the same way as a parallel index on an ordinary table. The main steps are: reading data in parallel, sorting in parallel to generate intermediate results, merging the intermediate sorting results, and generating the index. In practical applications, this approach makes index creation more efficient by speeding up data reading and sorting, but once the parallelism increases beyond a certain point, the efficiency of index creation degrades instead. The reasons include:
1. Limited by the openGauss transaction consistency mechanism, only the steps of reading data and sorting to generate intermediate results can be performed in parallel sub-threads; the steps of merging the intermediate sorting results and generating the index, which involve DML operations, can be performed only in the main thread, so the advantage of parallelism cannot be fully exploited;
2. As the parallelism increases, the memory available to each parallel sub-thread for sorting into intermediate results shrinks, and the number of intermediate sorting results that the main thread must later merge multiplies; merging the intermediate sorting results and generating the index are performed by a single thread and involve a large number of IO operations. Once the parallelism reaches a certain level, a large number of intermediate sorting results are generated, and although reading data and sorting into intermediate results become faster, overall index creation performance degrades because the single-threaded merging workload grows.
Therefore, how to provide a more efficient method for creating the partition index of the external partition table in parallel is a technical problem to be solved.
Disclosure of Invention
In view of the foregoing, the present invention is directed to a method and system for creating openGauss external partition table partition indexes in parallel to overcome the deficiencies of the prior art.
According to a first aspect of the present invention, there is provided a method of creating openGauss external partition table partition indexes in parallel, the method comprising:
obtaining a partition list of an external partition table, and creating corresponding partition index metadata for each partition in the partition list;
creating task queue elements according to partition information and the partition index metadata, and constructing a task queue from the created task queue elements;
setting a parallelism for the sub-threads that create partition indexes, the main thread sending the main transaction information and the task queue address to each sub-thread according to the set parallelism;
the sub-threads preemptively acquiring task queue elements from the task queue in parallel according to the task queue address, and creating the corresponding partition indexes according to the acquired task queue elements;
the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, obtaining a partition list of an external partition table and creating corresponding partition index metadata for each partition in the partition list includes: accessing the system table through the main thread, obtaining the partition list of the external partition table from the system table, generating a corresponding partition index name from each partition name in the partition list, creating an index storage file for the partition index corresponding to each partition, and writing the relation between the partition and its partition index into the system table.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, a task queue element is created according to partition information and partition index metadata, and a task queue is constructed according to the created task queue element, including: the task queue elements are in one-to-one correspondence with the partitions in the external partition table, and comprise partition definition information, partition data storage information, partition index key definition information and partition index data storage information.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, parallelism is set for sub-threads for creating partition indexes, and a main thread sends main transaction information and task queue addresses to each sub-thread according to the set parallelism, including:
Setting parallelism according to the partition number of the external partition table and the memory and system resources required for creating partition indexes for each partition;
the main thread starts the sub-threads according to the set parallelism, copies the transaction information of the main transaction to each sub-thread, and passes the address of the constructed task queue to each sub-thread.
Preferably, in the method of creating openGauss external partition table partition indexes in parallel, the transaction information of the main transaction includes a transaction ID, a transaction snapshot, and a transaction command ID.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, the sub-thread preemptively acquires task queue elements from the task queue in parallel according to the addresses of the task queue, creates corresponding partition indexes according to the acquired task queue elements, and includes: the sub-thread reads the partition data according to the definition information of the partition and the storage information of the partition data, generates index data according to key definition information of the partition index, sorts the generated index data, and writes the sorted index data into the index storage file according to the storage information of the partition index data.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, the sub-thread preemptively acquires task queue elements from the task queue in parallel according to the addresses of the task queue, creates corresponding partition indexes according to the acquired task queue elements, and further includes: after each sub-thread completes a partition index creating task, the sub-thread continues to preemptively acquire task queue elements from the task queue, and creates a corresponding partition index according to the acquired task queue elements.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes includes: while the sub-threads are creating partition indexes, detecting the state of the sub-threads; if any sub-thread is abnormal, judging the state of the current partition index creation task to be abnormal, stopping all the sub-threads, having the main thread perform the cleanup for abnormally terminated partition index creation, and cancelling all operations produced by the partition index creation.
Preferably, in the method for creating openGauss external partition table partition indexes in parallel, the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes includes: if all the sub-threads complete the creation of their partition indexes, the main thread commits the main transaction according to the transaction mechanism.
According to a second aspect of the present invention, there is provided a system for creating openGauss external partition table partition indexes in parallel, the system comprising a partition index creation server configured for: obtaining a partition list of an external partition table, and creating corresponding partition index metadata for each partition in the partition list; creating task queue elements according to partition information and the partition index metadata, and constructing a task queue from the created task queue elements; setting a parallelism for the sub-threads that create partition indexes, the main thread sending the main transaction information and the task queue address to each sub-thread according to the set parallelism; the sub-threads preemptively acquiring task queue elements from the task queue in parallel according to the task queue address, and creating the corresponding partition indexes according to the acquired task queue elements; and the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes.
According to a third aspect of the present invention there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect of the present invention when executing the program.
The method and the system for creating openGauss external partition table partition indexes in parallel adopt parallelism across partitions: once the efficiency of creating the index of a single partition has been measured, a suitable parallelism can be selected according to the number of partitions and the system resources, and the efficiency of creating the partition indexes of the external partition table improves linearly. The multithreading mechanism of the openGauss database is used effectively, and the existing modules used during index creation, such as data scanning and sorting, are reused, which saves a great deal of development work. Because a read-only external partition table has only read operations and no DML operations, under a service model in which the partitioned data volume is large while each single partition is small, the transaction consistency restrictions that apply during index creation can be simplified, and the efficiency of creating partition indexes is greatly improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system suitable for the method of creating openGauss external partition table partition indexes in parallel according to an embodiment of the present invention;
FIG. 2 is a flow chart of the steps of a method of creating openGauss external partition table partition indexes in parallel, according to an embodiment of the invention;
FIG. 3 is an exemplary diagram of the steps for committing partition indexes in a method of creating openGauss external partition table partition indexes in parallel, according to an embodiment of the invention;
Fig. 4 is a schematic structural diagram of the apparatus provided by the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be noted that, without conflict, the following embodiments and features in the embodiments may be combined with each other; and, based on the embodiments in this disclosure, all other embodiments that may be made by one of ordinary skill in the art without inventive effort are within the scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
FIG. 1 illustrates an exemplary system suitable for use in the method of creating openGauss external partition table partition indices in parallel in accordance with an embodiment of the invention. As shown in fig. 1, the system may include a partition index creation server 101, a communication network 102, and/or one or more partition index creation clients 103, which are illustrated in fig. 1 as a plurality of partition index creation clients 103.
The partition index creation server 101 may be any suitable server for storing information, data, programs, and/or any other suitable type of content. In some embodiments, the partition index creation server 101 may perform appropriate functions. For example, in some embodiments, the partition index creation server 101 may be configured to create openGauss external partition table partition indexes in parallel. As an alternative example, in some embodiments, the partition index creation server 101 may create openGauss external partition table partition indexes in parallel by building a task queue. For example, the partition index creation server 101 may be configured for: obtaining a partition list of an external partition table, and creating corresponding partition index metadata for each partition in the partition list; creating task queue elements according to partition information and the partition index metadata, and constructing a task queue from the created task queue elements; setting a parallelism for the sub-threads that create partition indexes, the main thread sending the main transaction information and the task queue address to each sub-thread according to the set parallelism; the sub-threads preemptively acquiring task queue elements from the task queue in parallel according to the task queue address, and creating the corresponding partition indexes according to the acquired task queue elements; and the main thread detecting the partition index creation task state of the sub-threads and, according to the detected state, either cancelling the partition index creation or committing the created partition indexes.
As another example, in some embodiments, the partition index creation server 101 may, at the request of the partition index creation client 103, provide the method of creating openGauss external partition table partition indexes in parallel to the partition index creation client 103 for use by a user.
As an optional example, in some embodiments, the partition index creation client 103 is configured to provide a visual partition index creation interface. The interface receives a user's selection input operation for creating openGauss external partition table partition indexes in parallel and, in response to that operation, obtains from the partition index creation server 101 the partition index creation interface corresponding to the selected option and displays it. The displayed interface shows at least the information on, and the operation options for, creating openGauss external partition table partition indexes in parallel.
In some embodiments, communication network 102 may be any suitable combination of one or more wired and/or wireless networks. For example, the communication network 102 can include any one or more of the following: the internet, an intranet, a Wide Area Network (WAN), a Local Area Network (LAN), a wireless network, a Digital Subscriber Line (DSL) network, a frame relay network, an Asynchronous Transfer Mode (ATM) network, a Virtual Private Network (VPN), and/or any other suitable communication network. The partition index creation client 103 can be connected to the communication network 102 via one or more communication links (e.g., communication link 104), and the communication network 102 can be linked to the partition index creation server 101 via one or more communication links (e.g., communication link 105). The communication link may be any communication link suitable for transferring data between the partition index creation client 103 and the partition index creation server 101, such as a network link, a dial-up link, a wireless link, a hardwired link, any other suitable communication link, or any suitable combination of such links.
Partition index creation client 103 may include any one or more clients that present, in a suitable form, interfaces related to creating openGauss external partition table partition indexes in parallel, for use and operation by a user. In some embodiments, partition index creation client 103 may comprise any suitable type of device. For example, in some embodiments, partition index creation client 103 may include a mobile device, a tablet computer, a laptop computer, a desktop computer, and/or any other suitable type of client device.
Although the partition index creation server 101 is illustrated as one device, in some embodiments any suitable number of devices may be used to perform the functions performed by the partition index creation server 101. For example, in some embodiments, multiple devices may be used to implement the functions performed by the partition index creation server 101. Or the function of the partition index creation server 101 may be implemented using a cloud service.
Based on the above system, the embodiment of the present invention provides a method for creating openGauss external partition table partition indexes in parallel, which is described in the following embodiments. FIG. 2 is a flow chart of the steps of a method of creating openGauss external partition table partition indexes in parallel, according to an embodiment of the invention. The method of this embodiment may be executed at a partition index creation server, and includes the following steps:
Step S201: obtaining a partition list of the external partition table, and creating corresponding partition index metadata for each partition in the partition list.
As an alternative example, partition index metadata is created in the method of this embodiment as follows: accessing the system table through the main thread, obtaining the partition list of the external partition table from the system table, generating a corresponding partition index name from each partition name in the partition list, creating an index storage file for the partition index corresponding to each partition, and writing the relation between the partition and its partition index into the system table.
Creating an index for a partition of an external partition table is a DDL operation. DDL (Data Definition Language) statements define database objects such as data segments, databases, tables, columns, and indexes. These operations are complex; if they were executed in a sub-thread, sub-transaction commit and rollback would have to be handled, which places high demands on the transaction mechanism. In this embodiment, the index metadata is therefore created in the main thread, and the definition information of the newly created index, together with the corresponding partition definition information, is put into a task queue for the sub-threads that later generate index data in parallel. Because the partition index metadata is created in the main thread, the consistency of the DDL operations is ensured by the main thread's transaction; and because creating partition index metadata operates only on metadata, even sequential execution does not affect performance.
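The metadata step can be pictured with a minimal Python sketch. It is an illustration only, not openGauss code: the `PartitionIndexMeta` record, the `create_index_metadata` helper, the `<partition>_<column>_idx` naming scheme, the `.dat` file names, and the plain dictionary standing in for the system table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionIndexMeta:
    partition_name: str
    index_name: str
    index_file: str  # hypothetical name of the (still empty) index storage file


def create_index_metadata(partitions, key_column, catalog):
    """Runs sequentially in the main thread: one metadata record per partition."""
    metas = []
    for part in partitions:
        # Derive the index name from the partition name (assumed naming scheme).
        index_name = f"{part}_{key_column}_idx"
        metas.append(PartitionIndexMeta(part, index_name, f"{index_name}.dat"))
        # Record the partition -> partition index relation in the "system table".
        catalog[part] = index_name
    return metas


catalog = {}
metas = create_index_metadata(["p2023", "p2024"], "order_id", catalog)
```

Only names and relations are written here, which is why the step is cheap enough to stay single-threaded under the main transaction.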
Step S202: creating task queue elements according to the partition information and the partition index metadata, and constructing a task queue from the created task queue elements.
As an optional example, in the method of this embodiment, the task queue elements correspond one-to-one to the partitions in the external partition table, and each task queue element includes the definition information of the partition, the storage information of the partition data, the key definition information of the partition index, and the storage information of the partition index data. In the method of this embodiment, the task queue elements are later used to scan the partition data and generate the index data.
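A task queue element of this kind might be modelled as follows. This is a sketch under assumptions: the `IndexTask` field names, the `build_task_queue` helper, and the file names are invented for illustration, and Python's thread-safe `queue.Queue` stands in for whatever shared structure the actual implementation uses.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexTask:
    partition_def: str      # definition information of the partition
    partition_storage: str  # storage information of the partition data
    index_key_def: tuple    # key definition information of the partition index
    index_storage: str      # storage information of the partition index data


def build_task_queue(partitions, key_columns):
    """Create exactly one task queue element per partition."""
    tasks = queue.Queue()
    for name, data_file in partitions:
        tasks.put(IndexTask(name, data_file, tuple(key_columns), f"{name}_idx.dat"))
    return tasks


task_queue = build_task_queue([("p1", "p1.dat"), ("p2", "p2.dat")], ["order_id"])
```

Each element carries everything a sub-thread needs, so a sub-thread never has to consult the system table itself.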
Step S203: setting a parallelism for the sub-threads that create partition indexes, the main thread sending the main transaction information and the task queue address to each sub-thread according to the set parallelism.
As an alternative example, in the method of this embodiment, the parallelism is set according to the number of partitions in the external partition table and the memory and system resources required to create the partition index of each partition. The principle behind this setting is that the sorting of index tuples performed while generating the index of each partition should be able to complete in memory, and that the parallelism must not exceed the number of CPU cores.
After the parallelism is set, in the method of the embodiment, the main thread starts the sub-threads according to the set parallelism, copies the transaction information of the main transaction to each sub-thread, and transfers the address of the constructed task queue to each sub-thread.
For example, in the method of this embodiment, the main thread starts the sub-threads according to the parallelism, copies the transaction information of the main transaction, such as the transaction ID, the transaction snapshot, and the transaction command ID, to each sub-thread, and passes the address of the constructed task queue to the sub-threads. Each sub-thread creates its own transaction management object, which simply reuses the existing transaction model to generate index data, with all transaction information coming from the main transaction. That is, a sub-thread is neither an independent transaction nor a sub-transaction, but a small unit under the main transaction. Because the external partition table is read-only, the snapshot and xlog mechanisms of the main thread's transaction can be used uniformly; the sub-threads do not need to commit or roll back independently, and everything is finally handled by the main thread's transaction. The work of each sub-thread can thus be regarded as part of the main thread's transaction, and each sub-thread generates index data as if it were running inside an ordinary transaction.
With the parallelism set in this way, the index data of each partition is kept as small as possible, so that generating and sorting the index data of each partition can be completed with an in-memory sort; because the IO operations of external sorting and merging are avoided, the efficiency of generating the index data of a single partition is greatly improved. This is the opposite of the intra-partition parallelism currently employed by openGauss: the prior-art approach needs each partition's data to be as large as possible for intra-partition parallelism to be worthwhile, yet once the data is that large a merge becomes necessary, and the merge creates a bottleneck. In practical applications, parallelism = MIN(available system memory / average memory required to sort the index data of one partition in memory, number of CPU cores available for index creation).
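As a worked illustration of the MIN formula, the sketch below computes a parallelism from memory and CPU figures. The function name `choose_parallelism` and its parameters are hypothetical; capping the result at the partition count and flooring it at 1 are added assumptions (the text says only that the partition count is taken into account).

```python
def choose_parallelism(available_memory, avg_sort_memory_per_partition,
                       available_cpu_cores, partition_count):
    """parallelism = MIN(memory budget / per-partition sort memory, CPU cores).

    The cap at partition_count and the floor of 1 are assumptions of this
    sketch, not part of the formula given in the text.
    """
    by_memory = available_memory // avg_sort_memory_per_partition
    return max(1, min(by_memory, available_cpu_cores, partition_count))


# 64 GiB free, about 4 GiB to sort one partition in memory, 32 cores, 100 partitions:
# memory allows 16 concurrent in-memory sorts, so memory is the limiting factor.
parallelism = choose_parallelism(64, 4, 32, 100)
```

With these figures the memory term (16) is smaller than the core count (32), so 16 sub-threads can each sort one partition entirely in memory.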
As an optional example, in the method of this embodiment, the main thread creates and starts the number of sub-threads given by the parallelism. At least two sub-threads must be created successfully; otherwise the method falls back to creating the index in single-threaded mode. The main thread copies the transaction information of its transaction, such as the transaction ID, the transaction snapshot, and the transaction command ID, to each sub-thread. Each sub-thread creates its own transaction management object for the transaction-related operations it performs while creating an index, such as judging the visibility of partition data and recording the recovery log (xlog) of the generated index data, but all of the transaction information comes from the main transaction.
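The hand-off of transaction information can be sketched as below. This is a simplified model, not the openGauss transaction manager: `TxnInfo`, `start_workers`, and `record` are hypothetical names, and the fallback is triggered here by a requested parallelism below 2 rather than by actual thread-creation failures.

```python
import threading
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TxnInfo:
    xid: int          # transaction ID of the main transaction
    snapshot: tuple   # transaction snapshot (modelled as a tuple of xids)
    command_id: int   # transaction command ID


def start_workers(main_txn, parallelism, work):
    """Start sub-threads, giving each its own copy of the main transaction info.

    Returns [] when fewer than two sub-threads would run, signalling the caller
    to fall back to single-threaded index creation (simplified assumption).
    """
    if parallelism < 2:
        return []
    workers = []
    for _ in range(parallelism):
        worker = threading.Thread(target=work, args=(replace(main_txn),))
        worker.start()
        workers.append(worker)
    return workers


seen = []
seen_lock = threading.Lock()


def record(txn):
    # Each sub-thread only reads the copied information; it never commits.
    with seen_lock:
        seen.append((txn.xid, txn.command_id))


main_txn = TxnInfo(xid=42, snapshot=(40, 41), command_id=1)
workers = start_workers(main_txn, parallelism=3, work=record)
for w in workers:
    w.join()
```

Every sub-thread observes the same transaction ID and command ID, which mirrors the idea that the sub-threads are units under the main transaction rather than transactions of their own.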
Step S204: the sub-thread preemptively acquires task queue elements from the task queue in parallel according to the addresses of the task queue, and creates corresponding partition indexes according to the acquired task queue elements.
As an optional example, in the method of this embodiment, each task queue element includes the partition definition information, the storage information of the partition data, the key definition information of the partition index, and the storage information of the partition index data. The sub-thread reads the partition data according to the partition definition information and the storage information of the partition data, generates index data according to the key definition information of the partition index, sorts the generated index data, and writes the sorted index data into the index storage file according to the storage information of the partition index data. After completing a partition index creation task, each sub-thread goes on to preemptively acquire further task queue elements from the task queue and creates the corresponding partition indexes, until all tasks in the task queue have been processed.
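The worker loop of step S204 can be sketched in Python as follows. All names here (`IndexTask`, `build_partition_index`, `worker`, `build_all`) are hypothetical, partitions are modeled as in-memory lists of tuples, and a shared dictionary stands in for the index storage files; the sketch only illustrates the pattern of threads competing for tasks on one shared queue.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class IndexTask:
    """Hypothetical task queue element: exactly one per partition."""
    partition_name: str
    rows: list          # stands in for partition definition and data storage info
    key_column: int     # stands in for the index key definition
    index_store: dict   # stands in for the index storage file


def build_partition_index(task: IndexTask) -> None:
    # Generate (key, row position) entries, then sort them entirely in memory.
    entries = [(row[task.key_column], pos) for pos, row in enumerate(task.rows)]
    entries.sort()
    task.index_store[task.partition_name] = entries


def worker(task_queue: queue.Queue) -> None:
    while True:
        try:
            # Whichever thread calls first gets the next task.
            task = task_queue.get_nowait()
        except queue.Empty:
            return      # all tasks have been taken
        build_partition_index(task)


def build_all(partitions: dict, key_column: int, parallelism: int) -> dict:
    index_store = {}
    tasks = queue.Queue()
    for name, rows in partitions.items():
        tasks.put(IndexTask(name, rows, key_column, index_store))
    threads = [threading.Thread(target=worker, args=(tasks,))
               for _ in range(parallelism)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return index_store
```

Because a thread takes a new task only when it has finished the previous one, partitions of uneven size balance themselves across the sub-threads without any up-front assignment.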
Step S205: the main thread detects the partition index creation task state of the sub-threads, and either cancels the partition index creation or commits the created partition indexes according to the detected state.
In the method of this embodiment, because the external partition table is read-only, the sub-threads do not commit or roll back transactions themselves but act as part of the main thread's transaction; after all sub-threads have finished, the main thread's transaction determines whether the generated index data is committed or aborted. The original intra-partition parallel scheme of openGauss can parallelize only the reading and sorting of data, whereas this method parallelizes the whole step of generating the index of each partition. Although the creation of any single partition index degrades to single-thread mode, the partitions are processed in parallel with one another and the merge step is avoided, which saves a large number of IO operations.
When a sub-thread exits, it returns its execution result to the main thread, so that the main thread can detect whether the sub-thread's execution was abnormal. The main thread commits or aborts the transaction according to the sub-thread execution results: if any sub-thread is abnormal, the current partition index creation transaction is judged abnormal, all sub-threads are stopped, and the partition index creation is canceled; otherwise, once the partition indexes have been created, the transaction is committed and the generated partition indexes can be used normally.
FIG. 3 is an exemplary diagram of the steps for committing partition indexes in the method for creating openGauss external partition table partition indexes in parallel according to an embodiment of the present invention. As shown in FIG. 3, as an optional example, in the method of this embodiment, the state of the sub-threads is detected while they create the partition indexes. If an abnormal sub-thread exists, the state of the current partition index creation task is judged abnormal, all sub-threads are stopped, the main thread performs the cleanup for abnormal termination of the partition index creation, and all operations produced by this partition index creation are canceled.
As an optional example, in the method of this embodiment, if all sub-threads complete the creation of their partition indexes, the main thread commits the main transaction according to the transaction mechanism. Because every sub-thread reuses the transaction information of the main thread's transaction, committing through the normal transaction mechanism commits the index creation operations of all sub-threads, and the recorded recovery log (xlog) is consistent with the main thread's transaction, which preserves normal transaction visibility and the backup/recovery mechanism.
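The commit-or-cancel decision of step S205 can be sketched in Python as follows. The function name `run_and_decide`, the string results, and the convention that each job receives a stop event are assumptions for illustration; a real implementation would commit or abort the main transaction instead of returning a string.

```python
import threading


def run_and_decide(worker_jobs):
    """Run each job in a sub-thread and return "commit" or "abort".

    Each job is a callable taking a threading.Event; a cooperative job
    checks stop.is_set() between tasks and returns early when it is set.
    """
    stop = threading.Event()
    errors = []
    errors_lock = threading.Lock()

    def run(job):
        try:
            job(stop)
        except Exception as exc:
            # Report the abnormal result to the main thread and ask the
            # remaining sub-threads to stop.
            with errors_lock:
                errors.append(exc)
            stop.set()

    threads = [threading.Thread(target=run, args=(job,)) for job in worker_jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Only the main thread decides the outcome of the single main transaction.
    return "abort" if errors else "commit"
```

The sub-threads never commit anything themselves in this sketch, which mirrors the description: a single failure is enough for the main thread to cancel the whole index creation.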
With the method and system for creating openGauss external partition table partition indexes in parallel, an inter-partition parallel mode is adopted: once the efficiency of creating a single partition index is known, a suitable parallelism can be selected according to the number of partitions and the system resources, so that the efficiency of creating indexes for the external partition table improves linearly. The multithreading mechanism of the openGauss database is used effectively, and the existing modules for data scanning and sorting during index creation are reused, which greatly reduces development work. Because a read-only external partition table is subject only to read operations and never to DML operations, in a service model where the total volume of partitioned data is large but each individual partition is small, the transaction-consistency constraints that apply during index creation can be simplified, and the efficiency of creating partition indexes is greatly improved.
As shown in FIG. 4, the present invention also provides an apparatus comprising a processor 310, a communication interface 320, a memory 330 for storing a computer program executable by the processor, and a communication bus 340. The processor 310, the communication interface 320, and the memory 330 communicate with one another through the communication bus 340. The processor 310 implements the above method for creating openGauss external partition table partition indexes in parallel by running the executable computer program.
The computer program in the memory 330 may be implemented in the form of software functional units and, when sold or used as a separate product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The system embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods of the embodiments or of some parts of the embodiments.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (10)
1. A method for creating openGauss external partition table partition indexes in parallel, the method comprising:
obtaining a partition list of an external partition table, and creating corresponding partition index metadata for each partition in the partition list;
creating task queue elements according to partition information and the partition index metadata, and constructing a task queue from the created task queue elements;
setting a parallelism for the sub-threads that create partition indexes, and sending, by a main thread, main transaction information and the task queue address to each sub-thread according to the set parallelism;
preemptively acquiring, by the sub-threads in parallel, task queue elements from the task queue according to the address of the task queue, and creating corresponding partition indexes according to the acquired task queue elements; and
detecting, by the main thread, the partition index creation task state of the sub-threads, and canceling the partition index creation or committing the created partition indexes according to the detected partition index creation task state.
2. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein obtaining the partition list of the external partition table and creating corresponding partition index metadata for each partition in the partition list comprises: accessing a system table through the main thread, obtaining the partition list of the external partition table from the system table, generating a corresponding partition index name according to each partition name in the partition list, creating an index storage file for the partition index corresponding to the partition, and writing the relationship between the partition and the partition index into the system table.
3. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein, in creating task queue elements according to partition information and the partition index metadata and constructing the task queue from the created task queue elements: the task queue elements correspond one-to-one to the partitions of the external partition table, and each task queue element comprises partition definition information, partition data storage information, partition index key definition information, and partition index data storage information.
4. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein setting the parallelism for the sub-threads that create partition indexes, and sending, by the main thread, main transaction information and the task queue address to each sub-thread according to the set parallelism, comprises:
setting the parallelism according to the number of partitions of the external partition table, the memory required to create the partition index of each partition, and the system resources;
starting, by the main thread, the sub-threads according to the set parallelism, copying the transaction information of the main transaction to each sub-thread, and passing the address of the constructed task queue to each sub-thread.
5. The method for creating openGauss external partition table partition indexes in parallel according to claim 4, wherein the transaction information of the main transaction includes a transaction ID, a transaction snapshot, and a transaction command ID.
6. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein preemptively acquiring, by the sub-threads in parallel, task queue elements from the task queue according to the address of the task queue, and creating corresponding partition indexes according to the acquired task queue elements, comprises: reading, by the sub-thread, the partition data according to the partition definition information and the storage information of the partition data, generating index data according to the key definition information of the partition index, sorting the generated index data, and writing the sorted index data into the index storage file according to the storage information of the partition index data.
7. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein preemptively acquiring, by the sub-threads in parallel, task queue elements from the task queue according to the address of the task queue, and creating corresponding partition indexes according to the acquired task queue elements, further comprises: after completing a partition index creation task, each sub-thread continues to preemptively acquire task queue elements from the task queue and creates the corresponding partition indexes according to the acquired task queue elements.
8. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein detecting, by the main thread, the partition index creation task state of the sub-threads, and canceling the partition index creation or committing the created partition indexes according to the detected partition index creation task state, comprises: detecting the state of the sub-threads while the sub-threads create the partition indexes; and, if an abnormal sub-thread exists, judging the state of the current partition index creation task to be abnormal, stopping all sub-threads, performing, by the main thread, the cleanup for abnormal termination of the partition index creation, and canceling all operations produced by the partition index creation.
9. The method for creating openGauss external partition table partition indexes in parallel according to claim 1, wherein detecting, by the main thread, the partition index creation task state of the sub-threads, and canceling the partition index creation or committing the created partition indexes according to the detected partition index creation task state, comprises: if all sub-threads complete the creation of the partition indexes, committing, by the main thread, the main transaction according to the transaction mechanism.
10. A system for creating openGauss external partition table partition indexes in parallel, the system comprising a partition index creation server configured to: obtain a partition list of an external partition table, and create corresponding partition index metadata for each partition in the partition list; create task queue elements according to partition information and the partition index metadata, and construct a task queue from the created task queue elements; set a parallelism for the sub-threads that create partition indexes, a main thread sending main transaction information and the task queue address to each sub-thread according to the set parallelism; have the sub-threads preemptively acquire, in parallel, task queue elements from the task queue according to the address of the task queue, and create corresponding partition indexes according to the acquired task queue elements; and have the main thread detect the partition index creation task state of the sub-threads, and cancel the partition index creation or commit the created partition indexes according to the detected partition index creation task state.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410422216.0A CN118193537A (en) | 2024-04-09 | 2024-04-09 | Method and system for parallel creating openGauss external partition table partition indexes |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410422216.0A CN118193537A (en) | 2024-04-09 | 2024-04-09 | Method and system for parallel creating openGauss external partition table partition indexes |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118193537A (en) | 2024-06-14 |
Family
ID=91413736
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410422216.0A Pending CN118193537A (en) | 2024-04-09 | 2024-04-09 | Method and system for parallel creating openGauss external partition table partition indexes |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118193537A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119166609A (en) * | 2024-08-27 | 2024-12-20 | 广州海量数据库技术有限公司 | A multi-version concurrency control method for openGauss database under distributed multi-write architecture |
| CN120233955A (en) * | 2025-05-29 | 2025-07-01 | 苏州无双医疗设备有限公司 | Implantable medical device data storage system and method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |