
CN113722284B - A cluster log storage method, device, equipment and medium - Google Patents

A cluster log storage method, device, equipment and medium

Info

Publication number
CN113722284B
Authority
CN
China
Prior art keywords
log
module
compression
stored
compressed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110873734.0A
Other languages
Chinese (zh)
Other versions
CN113722284A (en)
Inventor
董俊明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Jinan data Technology Co ltd
Original Assignee
Inspur Jinan data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Jinan data Technology Co ltd filed Critical Inspur Jinan data Technology Co ltd
Priority to CN202110873734.0A priority Critical patent/CN113722284B/en
Publication of CN113722284A publication Critical patent/CN113722284A/en
Application granted granted Critical
Publication of CN113722284B publication Critical patent/CN113722284B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a cluster log storage method, device, equipment and medium. The method includes: counting the access frequency of the log files of each target module in a cluster; determining a compression attribute for each target module based on the access frequency, where the compression attribute is either compressed or uncompressed; when a log to be stored is detected in the cluster, determining the compression attribute corresponding to that log; and, if the compression attribute is uncompressed, storing the log in a log cache area, or otherwise compressing the log to obtain a first compressed log and storing the first compressed log in a static storage area. The method can improve both the utilization of cluster storage space and log access efficiency.

Description

Cluster log storage method, device, equipment and medium
Technical Field
The present application relates to the field of log processing technologies, and in particular, to a method, an apparatus, a device, and a medium for storing a cluster log.
Background
In current distributed storage clusters, logs are important information: they contain records of various business operations, and both the historical running state of the cluster and fault localization depend on log analysis. Log data grows linearly over time and occupies a large amount of storage space, which reduces the overall space utilization of the distributed storage cluster. Because the logs are so numerous, managing and retrieving them also becomes more difficult. How massive log data is stored and retrieved therefore has an important impact on the cluster.
Disclosure of Invention
Accordingly, the present application aims to provide a method, apparatus, device and medium for storing cluster logs, which can improve the utilization rate of the cluster storage space and the log access efficiency. The specific scheme is as follows:
In a first aspect, the application discloses a cluster log storage method, which comprises the following steps:
counting the access frequency of the log files of each target module in the cluster;
determining a compression attribute corresponding to each target module based on the access frequency, wherein the compression attribute is either compressed or uncompressed;
when a log to be stored is detected in the cluster, determining the compression attribute corresponding to the log to be stored;
and if the compression attribute is uncompressed, storing the log to be stored in a log cache area; otherwise, compressing the log to be stored to obtain a first compressed log, and storing the first compressed log in a static storage area.
Optionally, the counting of the access frequency of the log files of each target module in the cluster includes:
counting the access frequency of the log files of each functional module and of each sub-module of each functional module;
correspondingly, the determining of the compression attribute corresponding to each target module based on the access frequency includes:
determining the compression attribute corresponding to each target module based on the access frequency, wherein a functional module whose compression attribute is uncompressed has a higher access frequency than a functional module whose compression attribute is compressed, and a sub-module whose compression attribute is uncompressed has a higher access frequency than a sub-module whose compression attribute is compressed.
Optionally, the determining of the compression attribute corresponding to each target module based on the access frequency includes:
sorting the functional modules by the access frequency of their log files;
selecting a preset number of functional modules with the highest access frequency to obtain first modules;
setting the compression attribute of the first modules to uncompressed, and setting the compression attribute of the functional modules other than the first modules to compressed;
sorting, by the access frequency of their log files, all sub-modules of the functional modules other than the first modules;
and setting the compression attribute of a first preset proportion of those sub-modules, taken from the highest access frequency downward, to uncompressed, and setting the compression attribute of the remaining sub-modules of the functional modules other than the first modules to compressed.
Optionally, the method further comprises:
storing the compression attribute of each target module in a log base library;
correspondingly, the determining of the compression attribute corresponding to the log to be stored includes:
looking up, in the log base library, the compression attribute of the functional module to which the log to be stored belongs; if that attribute is uncompressed, determining that the compression attribute of the log is uncompressed; if that attribute is compressed, looking up the compression attribute of the sub-module to which the log belongs; if the sub-module's attribute is uncompressed, determining that the compression attribute of the log is uncompressed; otherwise, determining that the compression attribute of the log is compressed.
Optionally, the method further comprises:
when the log to be stored is stored in the log cache area, storing the expiration time of the log in a log base library;
determining target logs from the log cache area based on the expiration time, at preset time intervals or at a specified time;
compressing the target logs to obtain second compressed logs;
and migrating the second compressed logs to the static storage area.
Optionally, the determining of the target logs from the log cache area based on the expiration time includes:
determining expired logs in the log cache area based on the expiration time;
and randomly selecting a specified number of the expired logs to obtain the target logs.
Optionally, the method further comprises:
when the usage ratio of the log cache area reaches a preset threshold, sorting all sub-modules whose compression attribute is uncompressed by the access frequency of their log files;
compressing the log files of a second preset proportion of those sub-modules, taken from the lowest access frequency upward, to obtain third compressed logs;
and migrating the third compressed logs to the static storage area.
In a second aspect, the present application discloses a cluster log storage device, including:
an access frequency statistics module, configured to count the access frequency of the log files of each target module in the cluster;
a module attribute determining module, configured to determine a compression attribute corresponding to each target module based on the access frequency, wherein the compression attribute is either compressed or uncompressed;
a to-be-stored log monitoring module, configured to monitor whether there is a log to be stored in the cluster;
a compression attribute determining module, configured to determine the compression attribute corresponding to the log to be stored when the to-be-stored log monitoring module detects a log to be stored in the cluster;
and a log storage module, configured to store the log to be stored in a log cache area if the compression attribute is uncompressed, and otherwise to compress the log to be stored to obtain a first compressed log and store the first compressed log in a static storage area.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the aforementioned cluster log storage method.
In a fourth aspect, the present application discloses a computer readable storage medium storing a computer program which, when executed by a processor, implements the aforementioned cluster log storage method.
In the present application, the access frequency of the log files of each target module in the cluster is counted, and a compression attribute, either compressed or uncompressed, is determined for each target module based on that frequency. When a log to be stored is detected in the cluster, its compression attribute is determined; if the attribute is uncompressed, the log is stored in the log cache area; otherwise, the log is compressed to obtain a first compressed log, which is stored in the static storage area. In other words, depending on the access frequency of each module's log files, a log is either compressed and stored or written uncompressed to the log cache area. On the one hand, compressing part of the logs improves the utilization of cluster storage space; on the other hand, compressed and uncompressed logs are kept in different areas, and uncompressed logs can be read directly from the log cache area, which improves log access efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a cluster log storage method disclosed by the application;
FIG. 2 is a flowchart of a specific method for storing cluster logs according to the present disclosure;
FIG. 3 is a schematic diagram of a cluster log storage device according to the present disclosure;
FIG. 4 is a schematic diagram of a specific cluster log storage scheme disclosed in the present application;
FIG. 5 is a block diagram of an electronic device according to the present disclosure.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In current distributed storage clusters, logs are important information: they contain records of various business operations, and both the historical running state of the cluster and fault localization depend on log analysis. Log data grows linearly over time and occupies a large amount of storage space, which reduces the overall space utilization of the distributed storage cluster. Because the logs are so numerous, managing and retrieving them also becomes more difficult, so how massive log data is stored and retrieved has an important impact on the cluster. The application therefore provides a cluster log storage scheme that can improve the utilization of cluster storage space and log access efficiency.
Referring to FIG. 1, an embodiment of the application discloses a cluster log storage method, which comprises the following steps:
Step S11: counting the access frequency of the log files of each target module in the cluster.
Step S12: determining a compression attribute corresponding to each target module based on the access frequency, wherein the compression attribute is either compressed or uncompressed.
In a specific embodiment, the access frequency of the log files of each functional module and of the sub-modules of each functional module can be counted, and the compression attribute of each target module is determined based on that frequency, such that a functional module whose compression attribute is uncompressed has a higher access frequency than one whose compression attribute is compressed, and a sub-module whose compression attribute is uncompressed has a higher access frequency than one whose compression attribute is compressed.
The functional modules may be, for example, the alarm module and the cache module of the cluster, and each functional module includes a plurality of sub-modules; for example, the alarm module includes an alarm sub-module, an event sub-module, and so on.
In a specific embodiment, the access frequency of the log files of each functional module and the sub-modules of each functional module may be counted regularly, and then the compression attribute corresponding to each target module may be determined based on the access frequency. Thus, the access frequency can be updated in time, and the compression attribute can be modified in time.
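The periodic counting described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `AccessFrequencyCounter`, its methods, and the choice to reset the counters after each snapshot are all assumptions made for the example.

```python
from collections import Counter


class AccessFrequencyCounter:
    """Hypothetical helper: counts log-file accesses per functional module
    and per (module, sub-module) pair within one statistics period."""

    def __init__(self):
        self.module_hits = Counter()
        self.submodule_hits = Counter()

    def record_access(self, module, submodule):
        # Called once for every log-file read.
        self.module_hits[module] += 1
        self.submodule_hits[(module, submodule)] += 1

    def snapshot_and_reset(self):
        # Assumption: each period starts from zero, so that the compression
        # attributes derived from the snapshot reflect recent access only.
        snapshot = (dict(self.module_hits), dict(self.submodule_hits))
        self.module_hits.clear()
        self.submodule_hits.clear()
        return snapshot
```

A scheduler would call `snapshot_and_reset()` at the end of each period and feed the result into the step that assigns compression attributes.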
Further, the functional modules can be sorted by the access frequency of their log files, and a preset number of functional modules with the highest access frequency are selected as first modules. The compression attribute of the first modules is set to uncompressed, and that of the other functional modules is set to compressed. All sub-modules of the functional modules other than the first modules are then sorted by the access frequency of their log files; the compression attribute of a first preset proportion of these sub-modules, taken from the highest access frequency downward, is set to uncompressed, and that of the remaining sub-modules is set to compressed.
That is, in the embodiment of the present application, the preset number of most frequently accessed functional modules are stored uncompressed, and among the sub-modules of the remaining functional modules, the first preset proportion with the highest access frequency are also stored uncompressed. This avoids the loss of access efficiency that would occur if frequently accessed log files inside the remaining functional modules were stored compressed.
The preset number and the first preset proportion can be configured according to actual conditions.
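The two-level selection (a preset number of functional modules, then a preset proportion of the remaining sub-modules) might look like the sketch below. The function name, the `True`/`False` encoding of the attribute (`True` meaning compressed), and rounding the proportion up with `math.ceil` are assumptions; the source does not specify how ties or fractional counts are handled.

```python
import math


def assign_compression_attributes(module_freq, submodule_freq, top_n, top_ratio):
    """Return (module_attr, sub_attr); a value of True means 'compressed'.

    module_freq:    {module: access count}
    submodule_freq: {(module, submodule): access count}
    top_n:          preset number of functional modules kept uncompressed
    top_ratio:      first preset proportion of remaining sub-modules kept uncompressed
    """
    # Rank functional modules by access frequency, highest first.
    ranked = sorted(module_freq, key=module_freq.get, reverse=True)
    first_modules = set(ranked[:top_n])
    module_attr = {m: m not in first_modules for m in module_freq}

    # Rank only the sub-modules that belong to non-first modules.
    remaining = [k for k in submodule_freq if k[0] not in first_modules]
    remaining.sort(key=submodule_freq.get, reverse=True)
    keep = math.ceil(len(remaining) * top_ratio)  # assumption: round up
    hot = set(remaining[:keep])
    sub_attr = {k: k not in hot for k in remaining}
    return module_attr, sub_attr
```

With `top_n=1` and `top_ratio=0.25`, for instance, the single busiest functional module stays uncompressed, and so does the busiest quarter of the sub-modules that belong to the other functional modules.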
Step S13: when a log to be stored is detected in the cluster, determining the compression attribute corresponding to the log to be stored.
In a specific embodiment, the compression attribute of each target module may be stored in a log base library. The compression attribute of the functional module to which the log to be stored belongs is looked up in the log base library. If it is uncompressed, the compression attribute of the log is determined to be uncompressed. If it is compressed, the compression attribute of the sub-module to which the log belongs is looked up: if the sub-module's attribute is uncompressed, the compression attribute of the log is determined to be uncompressed; otherwise, it is determined to be compressed.
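The lookup order (functional module first, sub-module only as a fallback) can be expressed compactly. In this sketch the log base library is modeled as two plain dictionaries, and treating an unknown module or sub-module as compressed is an assumption the source does not state.

```python
def resolve_compression(module, submodule, module_attr, sub_attr):
    """Return True if a log from (module, submodule) should be compressed.

    module_attr: {module: bool}               True means 'compressed'
    sub_attr:    {(module, submodule): bool}  True means 'compressed'
    """
    # 1) An uncompressed functional module decides the outcome immediately.
    if not module_attr.get(module, True):
        return False
    # 2) Otherwise the sub-module's attribute decides; unknown entries
    #    default to compressed (assumption).
    return sub_attr.get((module, submodule), True)
```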
In a specific embodiment, the log storage tasks issued by the cluster can be monitored. Detecting a log storage task indicates that there is a log to be stored in the cluster, and the compression attribute of the log associated with that task is then determined so that the log can be stored.
In addition, a globally unique identification number can be generated for each log storage task to facilitate task processing.
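One simple way to attach such an identifier is a random UUID, shown below as an illustration. The `LogStorageTask` type and its fields are hypothetical, and the source does not say how the identification number is actually generated.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class LogStorageTask:
    """Hypothetical task record for one log to be stored."""
    module: str
    submodule: str
    payload: bytes
    # uuid4 is random; collisions are possible in principle but negligible in practice.
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)
```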
Step S14: if the compression attribute is uncompressed, storing the log to be stored in the log cache area; otherwise, compressing the log to be stored to obtain a first compressed log, and storing the first compressed log in a static storage area.
It can be seen that, in the embodiment of the present application, the access frequency of the log files of each target module in the cluster is counted, and a compression attribute, either compressed or uncompressed, is determined for each target module based on that frequency. When a log to be stored is detected in the cluster, its compression attribute is determined; if the attribute is uncompressed, the log is stored in the log cache area; otherwise, the log is compressed to obtain a first compressed log, which is stored in the static storage area. In other words, depending on the access frequency of each module's log files, a log is either compressed and stored or written uncompressed to the log cache area. On the one hand, compressing part of the logs improves the utilization of cluster storage space; on the other hand, compressed and uncompressed logs are kept in different areas, and uncompressed logs can be read directly from the log cache area, which improves log access efficiency.
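The storage decision of step S14 reduces to a single branch. The sketch below models the log cache area and the static storage area as in-memory dictionaries and uses `zlib` as the compressor; both choices are assumptions for illustration, since the source names neither a storage backend nor a compression algorithm.

```python
import zlib


def store_log(log_id, data, compress, cache_area, static_area):
    """Store one log and return the name of the area it was written to."""
    if compress:
        # Cold log: compress first, then place it in the static storage area.
        static_area[log_id] = zlib.compress(data)
        return "static"
    # Hot log: keep the raw bytes in the cache area for direct reads.
    cache_area[log_id] = data
    return "cache"
```

Returning the area name lets the caller record the location information that a later read needs.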
Referring to FIG. 2, an embodiment of the application discloses a specific cluster log storage method, which comprises the following steps:
Step S21: counting the access frequency of the log files of each target module in the cluster.
Step S22: determining a compression attribute corresponding to each target module based on the access frequency, wherein the compression attribute is either compressed or uncompressed.
Step S23: when a log to be stored is detected in the cluster, determining the compression attribute corresponding to the log to be stored.
Step S24: if the compression attribute is uncompressed, storing the log to be stored in a log cache area; otherwise, compressing the log to be stored to obtain a first compressed log, and storing the first compressed log in a static storage area.
Step S25: when the log to be stored is stored in the log cache area, storing the expiration time of the log in a log base library.
Step S26: determining target logs from the log cache area based on the expiration time, at preset time intervals or at a specified time.
For example, this may be done once a day, once a week, at a specified time each day, or on a specified day each month.
In a specific embodiment, expired logs can be determined in the log cache area based on the expiration time, and a specified number of the expired logs are randomly selected as the target logs.
Step S27: compressing the target logs to obtain second compressed logs.
Step S28: migrating the second compressed logs to the static storage area.
It can be understood that the embodiment of the application can process expired logs. To reduce the pressure of processing them, the target logs are selected from the expired logs by random sampling at preset time intervals or at a specified time, and are then compressed and stored.
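A sketch of one such sampling pass is given below. The dictionaries standing in for the cache area, the static storage area, and the expiration records are assumptions, as are the function name and the use of `zlib`; the `rng` parameter exists only so that a seeded `random.Random` can be passed in for repeatable runs.

```python
import random
import zlib


def migrate_expired(cache_area, static_area, expires_at, now, sample_size, rng=random):
    """Compress and move a random sample of expired logs; return the moved ids."""
    # A log is expired once its recorded expiration time has passed.
    expired = [log_id for log_id in cache_area
               if expires_at.get(log_id, float("inf")) <= now]
    # Sample at most sample_size expired logs so one pass stays cheap.
    targets = rng.sample(expired, min(sample_size, len(expired)))
    for log_id in targets:
        static_area[log_id] = zlib.compress(cache_area.pop(log_id))
        expires_at.pop(log_id, None)
    return targets
```

Expired logs that are not sampled in one pass remain in the cache area and stay eligible for the next scheduled pass.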
In addition, in a specific embodiment, when the usage ratio of the log cache area reaches a preset threshold, all sub-modules whose compression attribute is uncompressed can be sorted by the access frequency of their log files; the log files of a second preset proportion of these sub-modules, taken from the lowest access frequency upward, are compressed to obtain third compressed logs, and the third compressed logs are migrated to the static storage area.
It can be understood that, because the storage occupied by logs grows linearly, when the capacity usage ratio reaches the threshold, the log files ranked lowest by access frequency, up to the second preset proportion, are forcibly compressed and packed to release storage resources. The percentage of resources to release and the second preset proportion are both configurable.
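The threshold-triggered eviction can be sketched as follows. For simplicity the example assumes one log file per sub-module, measures cache usage as the sum of file sizes against a fixed capacity, and rounds the proportion up; none of these details come from the source, and the function and parameter names are hypothetical.

```python
import math
import zlib


def relieve_cache_pressure(cache_files, static_area, submodule_freq,
                           capacity_bytes, threshold=0.8, cold_ratio=0.5):
    """Compress and move the least-accessed sub-module log files once cache
    usage reaches threshold * capacity_bytes; return the evicted keys."""
    used = sum(len(data) for data in cache_files.values())
    if used < threshold * capacity_bytes:
        return []  # below the preset threshold: nothing to do
    # Rank cached sub-modules by access frequency, lowest first.
    ranked = sorted(cache_files, key=lambda key: submodule_freq.get(key, 0))
    count = math.ceil(len(ranked) * cold_ratio)  # assumption: round up
    evicted = ranked[:count]
    for key in evicted:
        static_area[key] = zlib.compress(cache_files.pop(key))
    return evicted
```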
Further, in this embodiment, when a log read request issued by the cluster is obtained, the compression attribute of the module to which the requested log belongs and the location information of the log are read from the log base library. If the compression attribute is compressed, the log is read from the static storage area according to the location information, decompressed, and returned, and a log access record is stored in the log base library. If the compression attribute is uncompressed, the log is read from the log cache area according to the location information and returned directly, and the expiration time of the log is updated.
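The read path described above can be sketched like this. The per-log metadata layout (`compressed`, `location`, `expires_at`, `access_log`), the dictionary-based storage areas, and the fixed time-to-live used to refresh the expiration time are all assumptions. The sketch records an access for every read, which goes slightly beyond the paragraph above, where only the compressed branch mentions the access record.

```python
import time
import zlib


def read_log(log_id, base, cache_area, static_area, ttl_seconds, now=None):
    """Return the raw bytes of a log, using metadata from the log base library."""
    now = time.time() if now is None else now
    meta = base[log_id]
    # Keep an access record; it feeds the next access-frequency count.
    meta.setdefault("access_log", []).append(now)
    if meta["compressed"]:
        # Cold path: fetch from the static storage area and decompress.
        return zlib.decompress(static_area[meta["location"]])
    # Hot path: return directly and push the expiration time forward.
    meta["expires_at"] = now + ttl_seconds
    return cache_area[meta["location"]]
```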
Referring to FIG. 3, an embodiment of the present application discloses a cluster log storage device, including:
an access frequency statistics module 11, configured to count the access frequency of the log files of each target module in the cluster;
a module attribute determining module 12, configured to determine a compression attribute corresponding to each target module based on the access frequency, wherein the compression attribute is either compressed or uncompressed;
a to-be-stored log monitoring module 13, configured to monitor whether there is a log to be stored in the cluster;
a compression attribute determining module 14, configured to determine the compression attribute corresponding to the log to be stored when the to-be-stored log monitoring module 13 detects a log to be stored in the cluster;
and a log storage module 15, configured to store the log to be stored in the log cache area if the compression attribute is uncompressed, and otherwise to compress the log to be stored to obtain a first compressed log and store the first compressed log in the static storage area.
It can be seen that, in the embodiment of the present application, the access frequency of the log files of each target module in the cluster is counted, and a compression attribute, either compressed or uncompressed, is determined for each target module based on that frequency. When a log to be stored is detected in the cluster, its compression attribute is determined; if the attribute is uncompressed, the log is stored in the log cache area; otherwise, the log is compressed to obtain a first compressed log, which is stored in the static storage area. In other words, depending on the access frequency of each module's log files, a log is either compressed and stored or written uncompressed to the log cache area. On the one hand, compressing part of the logs improves the utilization of cluster storage space; on the other hand, compressed and uncompressed logs are kept in different areas, and uncompressed logs can be read directly from the log cache area, which improves log access efficiency.
The access frequency statistics module 11 is specifically configured to count the access frequency of the log files of each functional module and of the sub-modules of each functional module.
Accordingly, the module attribute determining module 12 is specifically configured to:
determine the compression attribute corresponding to each target module based on the access frequency, wherein a functional module whose compression attribute is uncompressed has a higher access frequency than a functional module whose compression attribute is compressed, and a sub-module whose compression attribute is uncompressed has a higher access frequency than a sub-module whose compression attribute is compressed.
In a specific embodiment, the module attribute determining module 12 is specifically configured to:
sort the functional modules by the access frequency of their log files;
select a preset number of functional modules with the highest access frequency to obtain first modules;
set the compression attribute of the first modules to uncompressed, and set the compression attribute of the functional modules other than the first modules to compressed;
sort, by the access frequency of their log files, all sub-modules of the functional modules other than the first modules;
and set the compression attribute of a first preset proportion of those sub-modules, taken from the highest access frequency downward, to uncompressed, and set the compression attribute of the remaining sub-modules of the functional modules other than the first modules to compressed.
The device further comprises a compression attribute storage module, configured to:
store the compression attribute of each target module in a log base library;
accordingly, the compression attribute determining module 14 is specifically configured to:
look up, in the log base library, the compression attribute of the functional module to which the log to be stored belongs; if that attribute is uncompressed, determine that the compression attribute of the log is uncompressed; if that attribute is compressed, look up the compression attribute of the sub-module to which the log belongs; if the sub-module's attribute is uncompressed, determine that the compression attribute of the log is uncompressed; otherwise, determine that the compression attribute of the log is compressed.
The apparatus further comprises:
The expiration time storage module is used for storing the expiration time of the log to be stored into a log base when the log to be stored is stored into the log buffer area;
the target log determining module is used for determining a target log from the log cache area based on the expiration time at preset time intervals or at appointed time;
The log compression module is used for compressing the target log to obtain a second compressed log;
And the first log migration module is used for migrating the second compressed log to the static storage area.
Further, a target log determining module is specifically configured to determine an expiration log from the log buffer based on the expiration time, and randomly select a specified number of the expiration logs to obtain a target log.
Further, the apparatus comprises a second log migration module, configured to: when the usage ratio of the log cache area reaches a preset threshold, sort the access frequencies of the log files of all sub-modules whose compression attribute is uncompressed; compress the log files of the second preset proportion of sub-modules with the lowest access frequency to obtain a third compressed log; and migrate the third compressed log to the static storage area.
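The threshold-triggered migration might look like the following sketch. The cache layout (`{sub-module: {log id: bytes}}`), the names `evict_coldest` and `bottom_ratio`, rounding the proportion up, and `zlib` as the compressor are assumptions made for this example.

```python
import math
import zlib
from typing import Dict, List


def evict_coldest(
    cache: Dict[str, Dict[str, bytes]],
    static_area: Dict[str, bytes],
    submodule_freq: Dict[str, int],
    used_ratio: float,
    threshold: float,
    bottom_ratio: float,
) -> List[str]:
    """Compress and migrate logs of the least-accessed cached sub-modules."""
    if used_ratio < threshold:
        return []  # cache usage is below the preset threshold; nothing to do
    # Rank cached (uncompressed) sub-modules by access frequency, lowest first.
    ranked = sorted(cache, key=lambda s: submodule_freq.get(s, 0))
    count = math.ceil(len(ranked) * bottom_ratio)  # rounding up is an assumption
    evicted = ranked[:count]
    for sub in evicted:
        for log_id, data in cache.pop(sub).items():
            static_area[log_id] = zlib.compress(data)  # "third compressed log"
    return evicted
```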
For example, referring to fig. 4, an embodiment of the present application discloses a specific cluster log storage scheme. In a specific embodiment, the foregoing modules may be integrated, and a log task monitoring module, a log task management module, a log access module, a log update module, a log data statistics module, a log base, and the like may be deployed:
1) The log task monitoring module provides log task monitoring and log task issuing functions for the cluster. When the cluster issues a log storage task, the task is sent to the log task management module for processing, and a globally unique identification number is generated for the log task.
2) The log task management module manages the issued log storage tasks. When storing a log, it determines from the information in the log base whether compression is needed, obtains the storage location information, and stores the log on the log storage server: a log that needs compression is compressed and stored in the static storage area, and its location information is recorded; a log that does not need compression is stored in the log cache area, and its location in the log cache area and its configured expiration time are recorded in the log base. When reading a log, it reads the log according to the log's location information and, depending on the compression attribute, returns the log data to the user either after decompression or directly without decompression.
3) The log access module receives a log reading request issued by the cluster, obtains the target log through the log task management module, and stores a log access record in the log base after the access is completed. If the accessed file is in the log cache area, the expiration time of that log file is refreshed at the same time.
4) The log data statistics module counts and ranks the access frequency of the log files of each module. The logs of the n (configurable) most frequently used functional modules are placed directly in the log cache area, and newly generated logs of the top m (configurable) most frequently accessed sub-modules are also placed in the log cache area. To reduce the performance pressure caused by real-time updates in a large-scale cluster, the statistical analysis and ranking of logs and the update of the corresponding modules' compression attributes follow a timed update strategy (configurable), whereas the access frequency statistics of the log files are updated in real time.
5) The log update module processes logs that have reached their expiration time. To reduce the pressure of processing expired log files, the module uses a timed strategy and a periodic strategy: under the timed strategy, the system processes expired files at set times by random sampling; under the periodic strategy, expired logs are processed in batches at regular intervals. Expired logs are migrated out of the log cache area and packed for storage. In addition, because the storage capacity occupied by logs grows linearly, when the capacity usage ratio reaches a threshold, the log update module forcibly packs the log files of the sub-modules ranked in the bottom k percent (configurable), and releases storage resources (the percentage of resources released is configurable).
6) The log base stores the log access records, the compression attribute of each module, and the expiration time of each log file.
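The store and read paths of the log task management module, together with the expiration refresh performed on access to the log cache area, might be sketched as follows. The class names `LogStore` and `LogRecord`, the fixed `ttl_seconds`, the dictionaries standing in for the storage areas and the log base, and `zlib` as the compressor are all hypothetical choices for this example rather than details taken from the source.

```python
import time
import zlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LogRecord:
    area: str                          # "cache" or "static" (location information)
    compressed: bool                   # compression attribute used at store time
    expires_at: Optional[float] = None  # only set for logs in the cache area


@dataclass
class LogStore:
    ttl_seconds: float = 3600.0        # hypothetical configured expiration interval
    cache: Dict[str, bytes] = field(default_factory=dict)
    static_area: Dict[str, bytes] = field(default_factory=dict)
    log_base: Dict[str, LogRecord] = field(default_factory=dict)

    def store(self, log_id: str, data: bytes, compress: bool,
              now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        if compress:
            # Compressed logs go to the static storage area; record the location.
            self.static_area[log_id] = zlib.compress(data)
            self.log_base[log_id] = LogRecord("static", True)
        else:
            # Uncompressed logs go to the cache area with an expiration time.
            self.cache[log_id] = data
            self.log_base[log_id] = LogRecord("cache", False, now + self.ttl_seconds)

    def read(self, log_id: str, now: Optional[float] = None) -> bytes:
        now = time.time() if now is None else now
        record = self.log_base[log_id]
        if record.area == "cache":
            # Accessing a cached log refreshes its expiration time.
            record.expires_at = now + self.ttl_seconds
            return self.cache[log_id]
        return zlib.decompress(self.static_area[log_id])
```

Passing `now` explicitly keeps the sketch deterministic; a real deployment would also persist the access record for the frequency statistics, which is omitted here.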
Through the above modules, the access frequency of log data is counted and the log data is divided into high-frequency and low-frequency data: low-frequency logs are packed and stored, and high-frequency logs are kept in the log cache area, which improves both storage space utilization and log access efficiency. Expired logs are processed automatically on a timed and periodic basis, which improves cluster storage utilization and avoids manual migration. As a result, the distributed storage cluster can optimize the storage of log files to a certain extent, reducing the space occupied by logs, improving log retrieval efficiency, and improving the data reliability and stability of the cluster.
Referring to fig. 5, an embodiment of the present application discloses an electronic device 20, which includes a processor 21 and a memory 22, wherein the memory 22 is used for storing a computer program, and the processor 21 is used for executing the computer program to implement the cluster log storage method disclosed in the foregoing embodiment.
For the specific process of the cluster log storage method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
The memory 22 serves as a carrier for resource storage and may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like; the storage may be transient or permanent.
In addition, the electronic device 20 further includes a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26, where the power supply 23 is configured to provide working voltages for each hardware device on the electronic device 20, the communication interface 24 is capable of creating a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed by the communication interface is any communication protocol applicable to the technical solution of the present application, and is not specifically limited herein, and the input/output interface 25 is configured to obtain external input data or output data to the external device, and a specific interface type thereof may be selected according to specific application needs and is not specifically limited herein.
Further, an embodiment of the present application also discloses a computer readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the cluster log storage method disclosed in the foregoing embodiment.
For the specific process of the cluster log storage method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing describes in detail the cluster log storage method, apparatus, device and medium provided by the present application. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core ideas. At the same time, a person skilled in the art may, following the ideas of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (9)

1. A cluster log storage method, comprising:
Counting the access frequency of log files of each target module in the cluster, wherein the target modules are functional modules, and each functional module comprises a plurality of sub-modules;
determining the compression attribute corresponding to each target module based on the access frequency, specifically: first sorting the access frequencies of the log files of the functional modules, selecting a preset number of functional modules with the highest access frequency, and determining their compression attribute as uncompressed; then sorting the access frequencies of the log files of the sub-modules of the remaining functional modules, and determining the compression attribute of the first preset proportion of sub-modules with the highest access frequency as uncompressed, wherein the compression attribute is either compressed or uncompressed;
When the logs to be stored in the cluster are monitored, determining the compression attribute corresponding to the logs to be stored;
If the compression attribute is not compressed, storing the log to be stored in a log cache area, otherwise, compressing the log to be stored to obtain a first compressed log, and storing the first compressed log in a static storage area;
wherein determining the compression attribute corresponding to the log to be stored comprises:
storing the compression attribute of each target module in a log base;
looking up, in the log base, the compression attribute of the functional module to which the log to be stored belongs; if the compression attribute of that functional module is uncompressed, determining the compression attribute corresponding to the log to be stored to be uncompressed; if the compression attribute of that functional module is compressed, looking up the compression attribute of the sub-module to which the log to be stored belongs, and, if the compression attribute of that sub-module is uncompressed, determining the compression attribute corresponding to the log to be stored to be uncompressed, and otherwise determining the compression attribute corresponding to the log to be stored to be compressed.
2. The cluster log storage method according to claim 1, wherein counting the access frequency of log files of each target module in the cluster includes:
counting the access frequency of log files of each functional module and each sub-module of each functional module;
Correspondingly, the determining the compression attribute corresponding to each target module based on the access frequency includes:
determining the compression attribute corresponding to each target module based on the access frequency, wherein the access frequency of a functional module whose compression attribute is uncompressed is higher than that of a functional module whose compression attribute is compressed, and the access frequency of a sub-module whose compression attribute is uncompressed is higher than that of a sub-module whose compression attribute is compressed.
3. The method of cluster log storage according to claim 2, wherein determining the compression attribute corresponding to each of the target modules based on the access frequency includes:
sorting the access frequencies of the log files of the functional modules;
selecting a preset number of functional modules with the highest access frequency to obtain a first module;
determining the compression attribute of the first module as uncompressed, and determining the compression attribute of the functional modules other than the first module as compressed;
sorting the access frequencies of the log files of all sub-modules of the functional modules other than the first module;
and determining the compression attribute of the first preset proportion of sub-modules with the highest access frequency as uncompressed, and determining the compression attribute of the other sub-modules outside the first module as compressed.
4. The cluster log storage method of claim 1, further comprising:
when the log to be stored is stored in the log cache area, the expiration time of the log to be stored is stored in a log base;
determining a target log from the log cache area based on the expiration time, either at preset time intervals or at a specified time;
compressing the target log to obtain a second compressed log;
And migrating the second compressed log to the static storage area.
5. The method of cluster log storage according to claim 4, wherein determining the target log from the log cache based on the expiration time comprises:
determining expired logs from the log cache area based on the expiration time;
and randomly selecting a specified number of the expired logs to obtain the target logs.
6. The cluster log storage method of claim 2, further comprising:
when the use proportion of the log cache area reaches a preset threshold, ordering the access frequency of the log files of all sub-modules with uncompressed compression attributes;
compressing the log files of the second preset proportion of sub-modules with the lowest access frequency to obtain a third compressed log;
And migrating the third compressed log to the static storage area.
7. A cluster log storage device, comprising:
an access frequency statistics module, configured to count the access frequency of log files of each target module in a cluster;
a module attribute determining module, configured to determine the compression attribute corresponding to each target module based on the access frequency, specifically: first sorting the access frequencies of the log files of the functional modules, selecting a preset number of functional modules with the highest access frequency, and determining their compression attribute as uncompressed; then sorting the access frequencies of the log files of the sub-modules of the remaining functional modules, and determining the compression attribute of the first preset proportion of sub-modules with the highest access frequency as uncompressed;
a to-be-stored log monitoring module, configured to monitor whether there is a log to be stored in the cluster;
a compression attribute determining module, configured to determine the compression attribute corresponding to the log to be stored when the to-be-stored log monitoring module detects a log to be stored in the cluster;
a log storage module, configured to store the log to be stored in a log cache area if the compression attribute is uncompressed, and otherwise to compress the log to be stored to obtain a first compressed log and store the first compressed log in a static storage area;
a compression attribute storage module, configured to store the compression attribute of each target module in a log base;
wherein the compression attribute determining module is further configured to:
look up, in the log base, the compression attribute of the functional module to which the log to be stored belongs; if the compression attribute of that functional module is uncompressed, determine the compression attribute corresponding to the log to be stored to be uncompressed; if the compression attribute of that functional module is compressed, look up the compression attribute of the sub-module to which the log to be stored belongs, and, if the compression attribute of that sub-module is uncompressed, determine the compression attribute corresponding to the log to be stored to be uncompressed, and otherwise determine the compression attribute corresponding to the log to be stored to be compressed.
8. An electronic device, comprising:
a memory for storing a computer program;
A processor for executing the computer program to implement the cluster log storage method of any one of claims 1 to 6.
9. A computer readable storage medium for storing a computer program which when executed by a processor implements the cluster log storage method of any one of claims 1 to 6.
CN202110873734.0A 2021-07-30 2021-07-30 A cluster log storage method, device, equipment and medium Active CN113722284B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110873734.0A CN113722284B (en) 2021-07-30 2021-07-30 A cluster log storage method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110873734.0A CN113722284B (en) 2021-07-30 2021-07-30 A cluster log storage method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113722284A CN113722284A (en) 2021-11-30
CN113722284B true CN113722284B (en) 2025-03-11

Family

ID=78674550

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110873734.0A Active CN113722284B (en) 2021-07-30 2021-07-30 A cluster log storage method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113722284B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115801305B (en) * 2022-09-08 2023-11-07 武汉思普崚技术有限公司 Network attack detection and identification method and related equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144791A (en) * 2018-09-30 2019-01-04 北京金山云网络技术有限公司 Data conversion storage method, apparatus and data management server

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5325091A (en) * 1992-08-13 1994-06-28 Xerox Corporation Text-compression technique using frequency-ordered array of word-number mappers
JP3704483B2 (en) * 1992-10-22 2005-10-12 日本電気株式会社 File compression processing device
JP2002132546A (en) * 2000-10-24 2002-05-10 Xaxon R & D Corp Storage device
CN102411533A (en) * 2011-08-08 2012-04-11 浪潮电子信息产业股份有限公司 Log-management optimizing method for clustered storage system
CN108804042B (en) * 2018-06-16 2021-06-15 浙江力石科技股份有限公司 A method and system for dynamic processing based on a data group removed from a cache
CN110222020B (en) * 2019-05-07 2023-12-19 平安科技(深圳)有限公司 Log file management method, device, computer equipment and storage medium
US20200356297A1 (en) * 2019-05-10 2020-11-12 Hitachi, Ltd. Method of storage control based on log data types
CN111090625A (en) * 2019-10-12 2020-05-01 苏州浪潮智能科技有限公司 Method, device and medium for compressing management log
CN112527753B (en) * 2020-12-11 2023-05-26 平安科技(深圳)有限公司 DNS analysis record lossless compression method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN113722284A (en) 2021-11-30

Similar Documents

Publication Publication Date Title
US10372723B2 (en) Efficient query processing using histograms in a columnar database
CN112262379B (en) Storing data items and identifying stored data items
CN113485999B (en) Data cleaning method, device and server
CN109271435A (en) A kind of data pick-up method and system for supporting breakpoint transmission
CN110727406B (en) Data storage scheduling method and device
CN108228322B (en) Distributed link tracking and analyzing method, server and global scheduler
CN113297245A (en) Method and device for acquiring execution information
CN114416670A (en) Index creating method and device suitable for network disk document, network disk and storage medium
CN113722284B (en) A cluster log storage method, device, equipment and medium
CN108021562B (en) Disk storage method and device applied to distributed file system and distributed file system
US9405786B2 (en) System and method for database flow management
CN114238258A (en) Database data processing method and device, computer equipment and storage medium
CN104317820B (en) Statistical method and device for report forms
CN116303667A (en) Time sequence data persistence method and system based on data service platform
CN112765170B (en) Embedded time sequence data management method and device
CN106648550B (en) Method and device for concurrently executing tasks
CN112000619A (en) Time sequence data storage method, device, equipment and readable storage medium
CN111782588A (en) File reading method, device, equipment and medium
CN113297205B (en) Index construction and data access processing method, device, equipment and medium
CN116303417A (en) Method and electronic device for rebuilding database index
HK40057905A (en) Method and device for acquiring execution information
CN117785827A (en) Log data query method and system
CN120492412A (en) File merging method, device, electronic device and computer program product
US9442992B2 (en) Access system and method for accessing signal data
HK1118402B (en) Method and system for providing log services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant