CN112905127B - Data processing method and data processing system - Google Patents
Data processing method and data processing system
- Publication number
- CN112905127B (application CN202110320495.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- compression
- component
- data processing
- historical
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Testing And Monitoring For Control Systems (AREA)
Abstract
The present disclosure provides a data processing method and a data processing system. The data processing method is applied to a data processing system, and the data processing system comprises a data compression component and a service coordination component; the data processing method comprises the following steps: compressing, by the data compression component, historical data satisfying data compression conditions, and prohibiting, by the service coordination component, writing and/or reading of the historical data under compression; in response to completion of the compression of the historical data, writing and/or reading of the compressed historical data is allowed by the service coordination component.
Description
Technical Field
The disclosure relates to the field of data processing and the field of new energy power generation, in particular to a data processing method and a data processing system.
Background
The new energy industry has developed rapidly in recent years, and wind power generation and photovoltaic power generation have become important clean energy (also called new energy) power generation technologies. The safe and stable operation of new energy power generation equipment is particularly important for ensuring the stability of the power grid and the generating capacity of new energy. With the rapid deployment of new energy power generation in recent years, the volume of operation data from power generation equipment keeps growing, and so does the number of applications that use it. This places very high demands on real-time data storage and query.
Many data processing methods exist, but they struggle to support the mass data processing of the new energy industry and cannot provide fast retrieval of data while storing or compressing it. Moreover, such methods often ignore the usage characteristics and application scenarios of new energy equipment data.
Disclosure of Invention
Data processing for new energy power generation equipment requires both a high compression ratio, to reduce the occupied storage space, and efficient data retrieval and writing capabilities. Meeting both requirements at once is a major difficulty in new energy big data processing. The present disclosure can compress data while ensuring its availability (e.g., readability and writability) according to the data usage characteristics of new energy power generation equipment.
According to an embodiment of the present disclosure, there is provided a data processing method applied to a data processing system including a data compression component and a service coordination component; the data processing method comprises the following steps: compressing, by the data compression component, historical data satisfying data compression conditions, and prohibiting, by the service coordination component, writing and/or reading of the historical data under compression; in response to completion of the compression of the historical data, writing and/or reading of the compressed historical data is allowed by the service coordination component.
According to an embodiment of the present disclosure, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method as described above.
According to an embodiment of the present disclosure, there is provided a computing device including: a processor; and a memory storing a computer program which, when executed by the processor, implements the data processing method as described above.
According to an embodiment of the present disclosure, there is provided a data processing system including: a data compression component configured to compress historical data satisfying data compression conditions; a service coordination component configured to prohibit writing and/or reading of historical data in compression; and in response to the compression of the historical data being completed, allowing the compressed historical data to be written and/or read.
By adopting the data processing method and the data processing system according to the embodiments of the present disclosure, at least the following technical effects can be achieved. The availability of data is ensured while the data is selectively compressed according to the data usage characteristics of new energy equipment (for example, long network disconnections required by power grid safety inspection, field maintenance and the like cause data to arrive late and to be recorded supplementarily). For example, massive historical data can be reasonably compressed to reduce the occupied storage space, while the read and write availability of uncompressed historical data is preserved. Writing and reading of the compressed data may also be allowed, e.g., the compressed data may be subjected to supplementary recording and retrieval.
Drawings
The above and other objects and features of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic diagram of a data processing system according to an embodiment of the present disclosure.
Fig. 2 is a flow chart of a data processing method according to an embodiment of the present disclosure.
Fig. 3 is a flow chart of a data compression process according to an embodiment of the present disclosure.
Fig. 4 is a flow chart of a data processing method according to another embodiment of the present disclosure.
Fig. 5 is a flow chart of a data processing method according to another embodiment of the present disclosure.
Fig. 6 is a flowchart of a data writing process according to an embodiment of the present disclosure.
Fig. 7 is a flowchart of a data reading process according to an embodiment of the present disclosure.
Fig. 8 is a schematic diagram of a computing device according to an embodiment of the disclosure.
Detailed Description
For the processing of mass data (for example, mass data of new energy equipment), conventional data processing technology retrieves compressed historical data inefficiently and has difficulty reading and writing uncompressed data efficiently while other data is being compressed. In addition, when compressing historical data, conventional data processing technology often ignores the fact that the historical data may still need supplementary recording.
The present disclosure provides a data processing method and a data processing system that can compress data while ensuring its availability according to the data usage characteristics of new energy equipment (for example, long network disconnections required by power grid safety inspection, field maintenance and the like cause data to arrive late and to be recorded supplementarily). For example, massive historical data can be reasonably compressed to reduce the occupied storage space, while the read and write availability of uncompressed historical data is preserved. In addition, the data processing method and the data processing system according to the embodiments of the present disclosure may also allow writing and reading of compressed data, for example, supplementary recording and retrieval of compressed data.
FIG. 1 is a schematic diagram of a data processing system 1 according to an embodiment of the present disclosure. As shown in fig. 1, the data processing system 1 may include a data compression component 11 and a service coordination component 12. According to embodiments of the present disclosure, service orchestration component 12 may be a component that orchestrates control of distributed services in a distributed storage system in communication with new energy power generation devices. Optionally, the data processing system 1 may further comprise a data writing component 13. Optionally, the data processing system 1 may further comprise a data reading component 14. The data processing system 1 and its various components described above may be implemented by software, hardware or firmware, or any combination of the above, configured to perform the corresponding functions. The above components may be separate from each other or integrated. The above examples are for illustration only, but the present disclosure is not limited thereto.
According to embodiments of the present disclosure, the data processing system 1 may be configured to process data of new energy power generation equipment. For example, and without limitation, the data processing system 1 may be installed in a central controller (e.g., a remote controller of a wind farm or of a photovoltaic farm) of new energy power generation equipment (e.g., wind turbines or photovoltaic generating units). The data processing system 1 and its components can communicate with each new energy power generation device via wired or wireless connections to process the data of that device. The above examples are for illustration only, and the present disclosure is not limited thereto.
According to embodiments of the present disclosure, the data compression component 11 may be configured to compress historical data that satisfies the data compression conditions. The service orchestration component 12 may be configured to prohibit writing and/or reading historical data that is being compressed; in response to compression of the history data being completed, writing and/or reading of the compressed history data is allowed. The data writing component 13 may be configured to receive a writing event, which may include, but is not limited to, at least one of the following information: a write request, data to be written, a database identification corresponding to the data to be written, write status information, and the like. The data writing component 13 may write the history data with the data to be written included in the writing event according to the instruction of the service coordination component 12. The data writing component 13 may receive a write event from a data store of the new energy power plant. Alternatively, the data writing component 13 may receive a writing event entered by a worker or a user via any input device.
According to embodiments of the present disclosure, the service coordination component 12 may be configured to monitor write events and/or read events directed at historical data being compressed. The data writing component 13 may be configured to cache the data to be written contained in a write event in response to the service coordination component 12 detecting the write event, and, in response to completion of the compression of the historical data, to write the compressed historical data using the cached data to be written.
According to embodiments of the present disclosure, the data reading component 14 may be configured to receive a read event, which may include, but is not limited to, a read request, a database identification corresponding to the read request, read status information, and the like. The read request may include, but is not limited to, a query request, a search request, and the like. The data reading component 14 can read (e.g., query, retrieve) the historical data based on the read event as directed by the service coordination component 12. The data reading component 14 may receive a read event from a controller or the like of the new energy power plant. Alternatively, the data reading component 14 may receive a read event entered by a worker or user via any read-write device.
A data processing method according to an embodiment of the present disclosure will be described below with reference to fig. 2 to 7. The data processing method according to the embodiment of the present disclosure is applicable to a data processing system, for example, the data processing system 1. The operations performed by the various components in the data processing system 1 may be understood with reference to the data processing methods described below.
Fig. 2 is a flow chart of a data processing method according to an embodiment of the present disclosure. A data processing method according to an embodiment of the present disclosure may include: compressing, by the data compression component, the history data satisfying the data compression condition (e.g., performing steps S21 and/or S22 by the data compression component), and prohibiting, by the service orchestration component, writing and/or reading of the history data being compressed (e.g., performing step S23 by the service orchestration component); in response to the compression of the history data being completed, writing and/or reading of the compressed history data is allowed by the service orchestration component (e.g., steps S24 and S25 are performed by the service orchestration component).
As shown in fig. 2, in step S21, it may be determined whether the history data satisfies the data compression condition. If it is determined that the data compression condition is satisfied, step S22 is executed, otherwise, the judgment is continued. The compression processing can be selectively performed on the history data according to the data compression conditions. According to an embodiment of the present disclosure, the data compression condition may include at least one of: a preconfigured time condition, a preconfigured read-write flow condition, a preconfigured state condition, and a preconfigured data table name condition. The present disclosure is not limited thereto. The data compression condition may be set according to the data use characteristics of the new energy device.
According to embodiments of the present disclosure, the preconfigured time condition may include that the time point of the historical data is earlier than the current time by more than a predetermined period. For example, the predetermined period may be set according to the network outage duration of the new energy device. A new energy device may be disconnected from the network for a long time due to safety checks or field maintenance of the power grid; for example, a network outage may last one to three months or longer. Historical data that falls within the outage may be frequently read and/or written once it has been recorded supplementarily. Thus, historical data older than the network outage duration may be stored in compressed form (e.g., using a data encoding algorithm and/or a distributed storage algorithm), while more recent historical data may be stored in normal (i.e., uncompressed) form.
According to the embodiment of the disclosure, normally stored historical data can be read and/or written quickly, which ensures its efficient use; historical data stored in compressed form occupies much less storage space, yet can still be written and/or read in response to actual read-write requests. Therefore, reasonable use of the historical data is ensured while the occupied storage space is effectively reduced.
Alternatively, the preconfigured read-write flow condition may include that the read-write flow of the historical data is less than or equal to a predetermined flow threshold. For example, the predetermined flow threshold may be set according to a statistical value (e.g., average, maximum, median, minimum, etc.) of the read-write flow of the historical data. In this manner, historical data whose read-write flow is less than or equal to the predetermined flow threshold may be stored in compressed form (e.g., using a data encoding algorithm and/or a distributed storage algorithm), while historical data whose read-write flow is greater than the predetermined flow threshold may be stored in normal form.
According to an embodiment of the present disclosure, to avoid compressing historical data (e.g., a historical data table) that is frequently read or written, whether the read-write flow of the historical data is less than or equal to the predetermined flow threshold is determined from the read-write flow reported by a read-write monitor (e.g., a read-write monitoring table). For example, the read-write flow of the historical data may be obtained through an application programming interface (API) for read-write flow monitoring provided by a cluster monitor (Cloudera Manager) that accesses the data processing system.
According to the embodiment of the disclosure, normally stored historical data has a higher usage rate (for example, a higher read-write flow) and, because it is not compressed, can be read and/or written quickly, which ensures its efficient use. Historical data stored in compressed form has a lower usage rate (for example, a lower read-write flow), so compressing it effectively reduces the occupied storage space, and it can still be written and/or read in response to actual read-write requests. Therefore, reasonable use of the historical data is ensured while the occupied storage space is effectively reduced.
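The time condition above can be illustrated with a minimal sketch. The 90-day window, the function name `meets_time_condition`, and the sample dates are assumptions chosen for illustration and do not come from the disclosure.

```python
from datetime import datetime, timedelta

# Hypothetical predetermined period: the longest expected network outage.
OUTAGE_WINDOW = timedelta(days=90)


def meets_time_condition(data_time: datetime, now: datetime,
                         window: timedelta = OUTAGE_WINDOW) -> bool:
    """Return True when the data is older than `now - window`."""
    return data_time < now - window


now = datetime(2021, 3, 25)
old_enough = meets_time_condition(datetime(2020, 11, 1), now)   # older than the window
still_recent = meets_time_condition(datetime(2021, 3, 1), now)  # inside the window
print(old_enough, still_recent)
```

Data inside the window stays uncompressed because it may still receive supplementary records after an outage ends.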
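A minimal sketch of the read-write flow condition follows. The table names and flow values are made up, and the sketch does not call any monitoring API; in practice the flow figures would come from the read-write monitor mentioned above.

```python
from statistics import mean, median


def flow_threshold(samples, statistic="median"):
    """Derive the predetermined flow threshold from a statistic of observed flows."""
    stats = {"mean": mean, "median": median, "max": max, "min": min}
    return stats[statistic](samples)


def meets_flow_condition(table_flow, threshold):
    """A table qualifies for compression when its flow is at or below the threshold."""
    return table_flow <= threshold


# Hypothetical read-write flows per historical data table (arbitrary units).
flows = {"turbine_2019": 2.0, "turbine_2020": 5.0, "turbine_2021": 120.0}
threshold = flow_threshold(list(flows.values()), "median")
cold_tables = sorted(name for name, flow in flows.items()
                     if meets_flow_condition(flow, threshold))
print(threshold, cold_tables)
```

Choosing the median here is only one option; the disclosure also lists the average, maximum, and minimum as possible statistics.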
Alternatively, the pre-configured status condition may include the history data being in a non-read-write state. According to the pre-configured state conditions, the historical data in the non-read-write state can be compressed, so that the data compression and the data read-write are prevented from being in conflict.
Alternatively, the preconfigured data table name condition may include that the data table name of the historical data is among the data table names designated for compression. With the preconfigured data table name condition, the historical data to be compressed can be identified accurately. For example, the historical data to be compressed may be identified by configuring a regular expression that matches the data table names designated for compression.
In step S22, the historical data satisfying the data compression condition may be compressed. According to embodiments of the present disclosure, the data compression component may compress the historical data satisfying the data compression condition using a data encoding algorithm and/or a distributed storage algorithm. Further, the historical data that does not satisfy the data compression condition may be stored normally (i.e., without compression) using a distributed storage algorithm.
According to embodiments of the present disclosure, the data encoding algorithm may include one or more of the following algorithms: an erasure coding (EC) algorithm, a Huffman coding algorithm, a run-length coding algorithm. Optionally, the distributed storage algorithm may include one or more of the following algorithms: a snapshot storage algorithm, a multi-copy storage algorithm. The multi-copy storage algorithm may include a double-copy storage algorithm and a triple-copy storage algorithm. For example, the erasure coding algorithm may be an erasure coding algorithm based on the Hadoop Distributed File System (HDFS). The erasure coding algorithm may include an array erasure coding algorithm, a Reed-Solomon (RS) erasure coding algorithm, and a low-density parity-check (LDPC) erasure coding algorithm.
In one embodiment of the present disclosure, the historical data satisfying the data compression condition may be compressed and stored using an erasure coding algorithm and/or a snapshot storage algorithm, and the historical data not satisfying the data compression condition may be stored normally using a multi-copy storage algorithm (e.g., a double-copy storage algorithm and/or a triple-copy storage algorithm). In another embodiment of the present disclosure, the historical data satisfying the data compression condition may be stored in compressed form using a double-copy storage algorithm, and the historical data not satisfying the data compression condition may be stored normally using a triple-copy storage algorithm.
Therefore, compressed storage effectively reduces the occupied storage space, while normal storage keeps the historical data efficient to use. The service coordination component can coordinate and control a plurality of services (such as the data writing service provided by the data writing component and the data reading service provided by the data reading component), so that the historical data satisfying the data compression condition can be compressed without affecting historical data that is not being compressed.
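The data table name condition might be expressed as follows. The pattern and the table names are hypothetical; the disclosure only states that a regular expression corresponding to the designated table names may be configured.

```python
import re

# Hypothetical pattern: yearly wind history tables are designated for compression.
COMPRESS_PATTERN = re.compile(r"wind_history_\d{4}")


def meets_name_condition(table_name, pattern=COMPRESS_PATTERN):
    """Return True when the whole table name matches the configured pattern."""
    return pattern.fullmatch(table_name) is not None


tables = ["wind_history_2018", "wind_history_2019", "wind_realtime", "pv_history_2019"]
to_compress = [name for name in tables if meets_name_condition(name)]
print(to_compress)
```

Using `fullmatch` rather than `search` avoids accidentally selecting tables whose names merely contain the pattern.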
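The storage saving of these embodiments can be estimated with simple arithmetic. The sketch below assumes a Reed-Solomon layout with 6 data units and 3 parity units, which is one common choice and is not specified in the disclosure; it also ignores padding of partially filled stripes.

```python
def replicated_size(logical_size, copies):
    """Raw storage used when every block is stored `copies` times."""
    return logical_size * copies


def erasure_coded_size(logical_size, data_units, parity_units):
    """Raw storage used by an erasure code with the given data/parity split."""
    return logical_size * (data_units + parity_units) / data_units


logical = 600  # hypothetical logical data size, e.g. in GB
three_copy = replicated_size(logical, 3)
two_copy = replicated_size(logical, 2)
rs_6_3 = erasure_coded_size(logical, 6, 3)  # assumed RS(6,3) layout

saving_ec = 1 - rs_6_3 / three_copy        # erasure coding versus triple copy
saving_two_copy = 1 - two_copy / three_copy  # double copy versus triple copy
print(three_copy, two_copy, rs_6_3, saving_ec, saving_two_copy)
```

Under these assumptions, moving from triple-copy storage to RS(6,3) halves the raw storage, while moving to double-copy storage saves one third.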
While step S22 is being performed, step S23 may be performed to prohibit writing and/or reading of the history data being compressed. In step S24, it may be determined whether compression of the history data is completed. For example, it may be periodically determined by the service orchestration component whether compression of the historical data is complete. If the compression of the history data is completed, step S25 is performed to allow writing and/or reading of the compressed history data; otherwise, continuing to judge whether the compression of the historical data is completed.
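Steps S23 to S25 can be sketched as a small gate that tracks which data sets are currently being compressed. The class name `ServiceCoordinator` and its methods are hypothetical; a real service coordination component in a distributed system would keep this state in a shared coordination service rather than in process memory.

```python
import threading


class ServiceCoordinator:
    """Tracks which data sets are being compressed and gates access to them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._compressing = set()

    def begin_compression(self, dataset_id):
        """S23: prohibit reads and writes for this data set."""
        with self._lock:
            self._compressing.add(dataset_id)

    def end_compression(self, dataset_id):
        """S25: allow reads and writes again once compression is complete."""
        with self._lock:
            self._compressing.discard(dataset_id)

    def is_access_allowed(self, dataset_id):
        """S24-style check that callers can poll periodically."""
        with self._lock:
            return dataset_id not in self._compressing


coordinator = ServiceCoordinator()
coordinator.begin_compression("wind_history_2019")
blocked = coordinator.is_access_allowed("wind_history_2019")
unaffected = coordinator.is_access_allowed("wind_realtime")
coordinator.end_compression("wind_history_2019")
reopened = coordinator.is_access_allowed("wind_history_2019")
print(blocked, unaffected, reopened)
```

Only the data set under compression is blocked; other data sets remain readable and writable throughout.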
The data compression process in the data processing method will be described below with reference to fig. 3. Fig. 3 is a flow chart of a data compression process according to an embodiment of the present disclosure. Compression of the historical data satisfying the data compression condition using the erasure coding algorithm is described here as an example, but the present disclosure is not limited thereto, and the historical data satisfying the data compression condition may be compressed by other algorithms capable of reducing the occupied storage space.
As shown in fig. 3, the data compression component may compress the historical data satisfying the data compression condition using an erasure coding algorithm.
In step S31, a directory based on the erasure coding algorithm may be created. For example, an erasure coding policy may be utilized to create a corresponding directory for data archiving with respect to historical data that satisfies the data compression conditions. According to embodiments of the present disclosure, a directory based on an erasure coding algorithm may be established on an HDFS in a controller of a new energy power plant, which may be configured to store data using an erasure coding strategy.
In step S32, the historical data satisfying the data compression condition may be migrated to the directory based on the erasure coding algorithm to compress the historical data. For example, the historical data satisfying the data compression condition can be copied into the corresponding directory, so that its storage format is converted into the storage format based on the erasure coding strategy, which compresses the historical data. According to the embodiment of the disclosure, if the historical data satisfying the data compression condition is initially stored in the format based on the triple-copy storage strategy, then after it is migrated to the directory based on the erasure coding algorithm its format is automatically converted into the storage format based on the erasure coding strategy, which markedly reduces the storage space it occupies.
In step S33, the directory based on the erasure coding algorithm may be renamed to the original directory name of the historical data. According to the embodiment of the disclosure, once the directory based on the erasure coding algorithm has taken over the original directory name, the compressed historical data can be retrieved under that original directory name, which ensures the readability of the compressed historical data.
According to an embodiment of the present disclosure, the data processing method may further include: before compressing the historical data satisfying the data compression condition, backing up that historical data through the data compression component; and, in response to a failure or timeout of the compression of the historical data, rolling back using the backed-up historical data to restore the historical data.
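The three steps S31 to S33 follow a create, copy, and rename pattern. The sketch below reproduces only that pattern on a local file system with Python's standard library; it performs no erasure coding, and the `_ec` and `_old` suffixes as well as the function name `migrate_and_swap` are assumptions made for illustration. On HDFS, the conversion would come from the erasure coding policy attached to the target directory.

```python
import shutil
import tempfile
from pathlib import Path


def migrate_and_swap(original: Path) -> Path:
    """Copy a flat directory into a staging directory, then take over its name."""
    staging = original.with_name(original.name + "_ec")
    staging.mkdir()                                  # S31: create the target directory
    for item in original.iterdir():                  # S32: migrate the files
        shutil.copy2(item, staging / item.name)
    retired = original.with_name(original.name + "_old")
    original.rename(retired)                         # S33: free the original name ...
    staging.rename(original)                         # ... and give it to the new directory
    shutil.rmtree(retired)                           # drop the superseded copy
    return original


with tempfile.TemporaryDirectory() as root:
    table = Path(root) / "wind_history_2019"
    table.mkdir()
    (table / "part-0000").write_text("sample rows")
    migrate_and_swap(table)
    print(sorted(p.name for p in table.iterdir()))  # prints ['part-0000']
```

Because readers keep addressing the data by its original directory name, they do not need to know that the underlying storage format changed.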
Fig. 4 is a flow chart of a data processing method according to another embodiment of the present disclosure.
As shown in fig. 4, in step S41, the historical data satisfying the data compression condition may be backed up. Step S42 may be performed after the backup to compress the historical data. Backing up the historical data prevents unrecoverable file damage to the historical data if an error occurs during data compression.
In step S43, it may be determined whether the compression of the historical data has failed or timed out. If so, step S44 is performed to roll back using the backed-up historical data and restore the historical data; otherwise, step S42 continues or step S43 is executed cyclically. Rolling back with the backed-up historical data restores historical data whose compression failed or timed out to its original state, avoids storage errors caused by the failure or timeout, and allows the compressed historical data to be cleared.
During data compression, a user or an external device may want to write (e.g., record supplementarily) and/or read the historical data being compressed, but the data compression should proceed independently of data writing and reading. Thus, write events and/or read events need to be monitored during data compression. An example is described below with reference to fig. 5, but the present disclosure is not limited thereto.
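The backup and rollback flow of steps S41 to S44 can be sketched with an in-memory store. The store layout, the function names, and the use of run-length encoding as the stand-in compressor are assumptions for illustration; a timeout is modeled simply as a `TimeoutError` raised by the compressor.

```python
import copy
from itertools import groupby


def run_length_encode(rows):
    """Stand-in compressor: collapse runs of equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(rows)]


def compress_with_rollback(store, key, compress):
    """Back up, compress, and roll back on failure or timeout."""
    backup = copy.deepcopy(store[key])       # S41: back up before compressing
    try:
        store[key] = compress(store[key])    # S42: compress
        return True
    except Exception:                        # S43: failure or timeout detected
        store[key] = backup                  # S44: restore the original data
        return False


def failing_compress(rows):
    rows.clear()                             # simulate damage done before the error
    raise TimeoutError("compression timed out")


store = {"good": [1, 1, 1, 2], "bad": [7, 7, 8]}
ok_good = compress_with_rollback(store, "good", run_length_encode)
ok_bad = compress_with_rollback(store, "bad", failing_compress)
print(ok_good, store["good"], ok_bad, store["bad"])
```

The deep copy matters in the failing case: the compressor damages the data in place before raising, and only the backup allows the original rows to be restored.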
Fig. 5 is a flow chart of a data processing method according to another embodiment of the present disclosure. According to embodiments of the present disclosure, write events and/or read events to historical data being compressed may be monitored by a service orchestration component. In addition, any historical data that is not under compression may also be allowed to be read and/or written (e.g., by the service orchestration component).
As shown in fig. 5, at step S51, a write event and/or a read event to the history data being compressed may be monitored.
For example, but not limited to, a compression event may be received from the data compression component through the service coordination component. The data compression component may actively send a compression event to the service coordination component in response to an indication by the service coordination component or upon initiation of data compression. The compression event may include, but is not limited to, at least one of the following information: an identification related to the historical data being compressed, compression status information, etc. The compression status information may include, but is not limited to, at least one of the following: a notification that compression of the historical data has started or is in progress (e.g., usable to signal that reading and/or writing is prohibited), and a notification that compression of the historical data has ended (e.g., usable to signal that reading and/or writing is allowed).
The service coordination component may send all or part of the information in the compression event (e.g., an identification related to the historical data being compressed, a notification that compression of the historical data has started or is in progress, a notification that compression of the historical data has ended, etc.) to the data writing component and/or the data reading component to inform the data writing component whether writing is allowed and/or to inform the data reading component whether reading is allowed.
Further, a write event can be received from the data write component through the service orchestration component. The data writing component may actively send a write event to the service orchestration component in response to an indication by the service orchestration component or upon a request for a data write. For example, a write event may be registered in the service orchestration component by the data writing component. The write event may include, but is not limited to, at least one of the following information: a write request, data to be written, a database identification corresponding to the data to be written, write status information, and the like. The write status information may include, but is not limited to, at least one of the following: a notification that the history data starts writing or is being written (e.g., may be used to notify that compression of the data to be written is prohibited or compression of the history data in the database corresponding to the data to be written is prohibited), a notification that the history data ends writing (e.g., may be used to notify that compression of the data to be written is permitted and/or compression of the history data in the database corresponding to the data to be written is permitted).
The service orchestration component may send all or part of the information in the write event (e.g., a write request for the historical data being compressed) to the data compression component to inform the data compression component whether to allow compression of the data to be written and/or allow compression of the historical data in the database corresponding to the data to be written.
Further, the service coordination component can receive a read event from the data reading component. The data reading component may actively send a read event to the service coordination component in response to an indication from the service coordination component or when a data read is requested. For example, the data reading component may register a read event with the service coordination component. The read event may include, but is not limited to, at least one of the following: a read request, a database identification corresponding to the read request, read status information, and the like. The read status information may include, but is not limited to, at least one of the following: a notification that reading of the historical data has started or is in progress (e.g., usable to signal that compression of the historical data in the database corresponding to the read request is prohibited), and a notification that reading of the historical data has ended (e.g., usable to signal that compression of the historical data in the database corresponding to the read request is permitted).
The service coordination component may send all or part of the information in the read event (e.g., a read request for the historical data being compressed) to the data compression component to inform the data compression component whether compression of the historical data in the database corresponding to the read request is allowed.
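The write and read events described above can be pictured as small records carrying a database identification and status information. The following is a minimal sketch under stated assumptions: the names `Status`, `WriteEvent`, `ReadEvent`, and `CompressionGate` are hypothetical and do not appear in the disclosure, and counting in-flight operations per database is only one possible way to derive whether compression is permitted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    STARTED = "started"    # writing/reading has started or is in progress
    FINISHED = "finished"  # writing/reading has ended


@dataclass(frozen=True)
class WriteEvent:
    database_id: str                 # database corresponding to the data to be written
    status: Status                   # write status information
    payload: Optional[bytes] = None  # data to be written, if carried by the event


@dataclass(frozen=True)
class ReadEvent:
    database_id: str  # database corresponding to the read request
    status: Status    # read status information


class CompressionGate:
    """Hypothetical helper: tracks in-flight reads/writes per database."""

    def __init__(self):
        self._active = {}  # database_id -> number of in-flight operations

    def on_event(self, event):
        count = self._active.get(event.database_id, 0)
        if event.status is Status.STARTED:
            count += 1
        else:
            count = max(0, count - 1)
        self._active[event.database_id] = count

    def compression_allowed(self, database_id):
        # Compression is permitted only when nothing is reading or writing.
        return self._active.get(database_id, 0) == 0
```

In this sketch, a "started" notification prohibits compression of the corresponding database and a matching "finished" notification permits it again.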
In step S52, it may be determined whether a write event is monitored. For example, it may be determined whether a write event is received from a data write component through a service orchestration component. In this embodiment, the write event includes a write request for the history data being compressed and the data to be written. Therefore, if a write event is detected, step S53 is performed, otherwise step S55 is performed.
In step S53, in response to the monitoring of the write event by the service coordination component, the data to be written included in the write event may be cached by the data writing component. Further, a temporary inhibit write notification for the history data being compressed may also be output by the service orchestration component. This can inhibit writing of the history data being compressed. But for any history data that is not under compression, a write operation corresponding to the write request may be allowed to be performed. Since writing cannot be performed for the history data being compressed, the data to be written can be cached for subsequent writing. The cached data to be written may be stored in a data compression component or a service orchestration component.
In step S54, the compressed history data may be written by the data writing component using the cached data to be written in response to the compression of the history data being completed.
According to embodiments of the present disclosure, the cached data to be written may be written into the compressed historical data based on the data encoding algorithm and/or the distributed storage algorithm described above. For example, based on an erasure coding algorithm, the cached data to be written may be compressed and stored into the corresponding database together with the corresponding compressed historical data. Alternatively, based on a snapshot storage algorithm, the compressed historical data corresponding to the data to be written may first be restored (i.e., returned to its pre-compression storage format); the data to be written may then be stored into the corresponding database together with that historical data (e.g., using a multi-copy storage algorithm); and the resulting historical data may then be compressed and stored again based on the snapshot storage algorithm. Compared with the snapshot storage algorithm, the erasure coding algorithm makes it easier to write the cached data to be written into the compressed historical data.
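One plausible reading of why erasure coding reduces occupied storage relative to multi-copy storage is the difference in redundancy overhead. The arithmetic below is an illustrative sketch only: the disclosure does not state the number of copies or the erasure coding parameters, so the values used here (3 copies, 6 data blocks plus 3 parity blocks) and the function names are assumptions.

```python
def replication_overhead(copies: int) -> float:
    """Stored bytes per byte of raw data under multi-copy storage."""
    return float(copies)


def erasure_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Stored bytes per byte of raw data under a (data, parity) erasure code."""
    return (data_blocks + parity_blocks) / data_blocks


raw_tb = 10.0  # assumed volume of historical data, in TB

# Assumed parameters: 3 copies versus 6 data blocks + 3 parity blocks.
multi_copy_tb = raw_tb * replication_overhead(3)
erasure_tb = raw_tb * erasure_overhead(6, 3)

print(multi_copy_tb)  # 30.0
print(erasure_tb)     # 15.0
```

Under these assumed parameters, the same historical data occupies half the space after migration to erasure-coded storage.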
In this way, supplementary recording can be performed on the compressed historical data. For example, a new energy power generation device may be offline for a long time (e.g., due to power grid inspection), so that its historical data cannot be uploaded to the remote control end in time. When the network connection is restored, the historical data accumulated during the outage (e.g., 3 months offline) can be uploaded. Because the historical data from before the outage has already been compressed and stored, the data writing component can obtain the historical data from the outage period and supplement it into the compressed historical data. Alternatively, if the historical data from before the outage is still being compressed, the historical data from the outage period may be cached and supplemented after the compression is completed.
In step S55, it may be determined whether a read event is detected. For example, it may be determined whether the service coordination component has received a read event from the data reading component. In this embodiment, the read event includes a read request for the historical data being compressed. Therefore, if a read event is detected, step S56 is performed; otherwise, the process returns to step S51.
In step S56, a prohibited read notification for the history data under compression may be output by the service orchestration component in response to the detection of the read event. This can prohibit reading of the history data being compressed. But for any history data that is not under compression, a read operation corresponding to the read request may be allowed to be performed.
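Steps S52 to S56 can be sketched as a small state machine. This is a simplified illustration, not the disclosed implementation: the class `Coordinator` and its method names are hypothetical, and the sketch folds the caching that the description assigns to the data writing component into a single class for brevity.

```python
from collections import defaultdict


class Coordinator:
    """Hypothetical stand-in for the service coordination component."""

    def __init__(self):
        self.compressing = set()        # database ids currently being compressed
        self.store = defaultdict(list)  # database id -> stored records
        self.cache = defaultdict(list)  # database id -> writes deferred during compression

    def start_compression(self, db):
        self.compressing.add(db)

    def on_write(self, db, record):
        # S52/S53: a write to data under compression is cached, not applied.
        if db in self.compressing:
            self.cache[db].append(record)
            return "write temporarily prohibited"
        self.store[db].append(record)
        return "written"

    def on_read(self, db):
        # S55/S56: a read of data under compression is refused (None here).
        if db in self.compressing:
            return None
        return list(self.store[db])

    def finish_compression(self, db):
        # S54: once compression completes, replay the cached writes.
        self.compressing.discard(db)
        self.store[db].extend(self.cache.pop(db, []))


c = Coordinator()
c.on_write("turbine_2020", {"ts": 1})
c.start_compression("turbine_2020")
c.on_write("turbine_2020", {"ts": 2})  # cached
c.finish_compression("turbine_2020")
print(c.on_read("turbine_2020"))       # [{'ts': 1}, {'ts': 2}]
```

Databases that are not under compression are unaffected: their reads and writes proceed immediately, which matches the behavior described for historical data that is not being compressed.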
The communication and the respective operations between the data writing component, the data reading component, the data compression component, and the service coordination component are described above, and the data writing process and the data reading process will be described below in connection with fig. 6 and 7, but the present disclosure is not limited thereto.
Fig. 6 is a flowchart of a data writing process according to an embodiment of the present disclosure. The data to be written may be received from the outside by a data writing component and processed to generate a write event.
As shown in fig. 6, data may be received by the data writing component (step S61). For example, data is received from an external device connected to the data writing component and/or data input to the data writing component by a user is received.
After the data is received, it may be cleansed (step S62). For example, data that is ambiguous, duplicated, incomplete, or in violation of business or logic rules may be subjected to corresponding cleansing operations. Optionally, the data may also be filtered (step S63). For example, the data may be filtered according to data storage rules of the new energy power plant, etc.
Optionally, the data may also be converted (step S64). For example, the format of the data may be converted to a storable format, such as an integer type, a real type, a boolean type, a character type, and a pointer type, or a combination of two or more of the foregoing types.
Finally, the data may be stored (step S65). For example, the data may be stored in a memory of the data writing component or in a memory connected to the data writing component. According to an embodiment of the present disclosure, data may be stored in a normal storage (i.e., non-compressed storage) manner or data satisfying a data compression condition may be stored in a compressed storage manner.
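The writing process of steps S61 to S65 can be sketched as a short pipeline. The record fields (`device`, `ts`, `value`), the function names, and the specific cleansing and filtering rules are assumptions chosen for illustration; the disclosure does not prescribe them.

```python
def cleanse(records):
    """S62: drop incomplete records and exact duplicates."""
    seen, out = set(), []
    for r in records:
        key = (r.get("device"), r.get("ts"), r.get("value"))
        if None in key or key in seen:
            continue
        seen.add(key)
        out.append(r)
    return out


def keep_allowed(records, allowed_devices):
    """S63: filter by a storage rule (here: an allow-list of devices)."""
    return [r for r in records if r["device"] in allowed_devices]


def convert(records):
    """S64: convert fields to storable types (character, integer, real)."""
    return [
        {"device": str(r["device"]), "ts": int(r["ts"]), "value": float(r["value"])}
        for r in records
    ]


def write_pipeline(records, allowed_devices, storage):
    """S61-S65: receive, cleanse, filter, convert, then store."""
    rows = convert(keep_allowed(cleanse(records), allowed_devices))
    storage.extend(rows)  # S65: normal (non-compressed) storage
    return rows
```

For example, passing two identical readings and one reading without a `value` field stores a single converted row.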
In addition, data reading may also be performed by the data reading component. Fig. 7 is a flowchart of a data reading process according to an embodiment of the present disclosure. A read request may be received from outside by a data reading component and processed to generate a read event.
As shown in fig. 7, a read request may be received by a data read component (step S71). For example, various query requests, retrieval requests, etc. may be received by the data reading component.
After receiving the read request, the read request may be split (step S72). For example, the read request may be split according to an identification of the new energy power plant, a database identification, a read request time, a data write time corresponding to the read request, a communication protocol associated with the read request, and so on. In this way, accurate identification of the database corresponding to the read request is facilitated.
After splitting the read request, the database may be searched (step S73). For example, a database corresponding to the read request may be searched according to the split read request. The data corresponding to the read request may be obtained by searching the database to generate a read result. Then, the read result may be output (step S74).
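The reading process of steps S71 to S74 can be sketched in the same way. The request format `plant/database/start-end`, the in-memory `databases` mapping, and the function names are hypothetical; they only illustrate how splitting a request helps identify the database to search.

```python
def split_request(request: str) -> dict:
    """S72: split a 'plant/database/start-end' request into its parts."""
    plant, database, span = request.split("/")
    start, end = span.split("-")
    return {"plant": plant, "database": database, "start": int(start), "end": int(end)}


def search(databases: dict, query: dict) -> list:
    """S73: look up the identified database and select rows in the time range."""
    rows = databases.get((query["plant"], query["database"]), [])
    return [r for r in rows if query["start"] <= r["ts"] <= query["end"]]


def read(databases: dict, request: str) -> list:
    """S71-S74: receive the request, split it, search, and return the result."""
    return search(databases, split_request(request))
```

A request naming an unknown plant or database simply yields an empty read result in this sketch.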
A data processing system and a data processing method according to embodiments of the present disclosure are described above with reference to fig. 1 to 7.
According to an embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed, implements a data processing method according to an embodiment of the present disclosure.
In an embodiment of the present disclosure, the computer-readable storage medium may carry one or more programs, which when executed, may implement the following steps described with reference to fig. 2 to 7: the data processing method is applied to a data processing system, and the data processing system comprises a data compression component and a service coordination component; the data processing method comprises the following steps: compressing, by the data compression component, historical data satisfying data compression conditions, and prohibiting, by the service coordination component, writing and/or reading of the historical data under compression; in response to completion of the compression of the historical data, writing and/or reading of the compressed historical data is allowed by the service coordination component.
The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer program embodied on a computer readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing. The computer readable storage medium may be embodied in any device; or may exist alone without being assembled into the device.
A data processing method and a data processing system according to embodiments of the present disclosure have been described above in connection with fig. 1 to 7. Next, a computing device according to an embodiment of the present disclosure is described in connection with fig. 8.
Fig. 8 is a schematic diagram of a computing device according to an embodiment of the disclosure.
Referring to fig. 8, a computing device 8 according to an embodiment of the present disclosure may include a memory 81 and a processor 82, with a computer program 83 stored on the memory 81, which when executed by the processor 82, implements a data processing method according to an embodiment of the present disclosure.
In an embodiment of the present disclosure, the computer program 83, when executed by the processor 82, may implement the operations of the data processing method described with reference to fig. 2 to 7: the data processing method is applied to a data processing system, and the data processing system comprises a data compression component and a service coordination component; the data processing method comprises the following steps: compressing, by the data compression component, historical data satisfying data compression conditions, and prohibiting, by the service coordination component, writing and/or reading of the historical data under compression; in response to completion of the compression of the historical data, writing and/or reading of the compressed historical data is allowed by the service coordination component.
The computing device illustrated in fig. 8 is merely an example and should not be taken as limiting the functionality and scope of use of embodiments of the present disclosure.
A data processing method, a data processing system, and a computing device according to embodiments of the present disclosure have been described above with reference to fig. 1 to 8. However, it should be understood that the data processing system shown in fig. 1 and its various components may each be configured as software, hardware, firmware, or any combination thereof to perform a particular function. The computing device shown in fig. 8 is not limited to the components shown above; components may be added or removed as needed, and the above components may also be combined.
By adopting the data processing method and the data processing system according to embodiments of the present disclosure, at least the following technical effects can be achieved. The data usage characteristics of new energy equipment are taken into account (for example, delayed data and supplementary recording caused by long network outages required for power grid safety inspection, field maintenance, and the like), so that data availability is ensured while data is selectively compressed. For example, massive historical data can be compressed to reduce the storage space it occupies, while read and write availability of the non-compressed historical data is preserved. Writing and reading of the compressed data may also be allowed; for example, supplementary recording and retrieval may be performed on the compressed data.
The control logic or functions performed by the various components or controllers in the control system may be represented by flow diagrams or similar illustrations in one or more figures. These figures provide representative control strategies and/or logic that may be implemented using one or more processing strategies such as event-driven, interrupt-driven, multi-tasking, multi-threading, and the like. Accordingly, various steps or functions illustrated may be performed in the order illustrated, in parallel, or in some cases omitted. Although not always explicitly shown, one of ordinary skill in the art will recognize that one or more of the steps or functions illustrated may be repeatedly performed depending on the particular processing strategy being used.
Although the present disclosure has been shown and described with respect to the preferred embodiments, it will be understood by those skilled in the art that various modifications and changes may be made to these embodiments without departing from the spirit and scope of the disclosure as defined by the appended claims.
Claims (20)
1. A data processing method, wherein the data processing method is applied to a data processing system, and the data processing system comprises a data compression component and a service coordination component;
the data processing method comprises the following steps:
Compressing, by the data compression component, historical data satisfying data compression conditions, and prohibiting, by the service coordination component, writing and/or reading of the historical data under compression;
in response to completion of compression of the historical data, allowing, by the service coordination component, writing and/or reading of the compressed historical data;
wherein the data processing method further comprises: monitoring, by the service coordination component, write events and/or read events directed to the historical data being compressed;
wherein the data processing system further comprises a data writing component, and the data processing method further comprises: in response to the service coordination component detecting the write event, caching, by the data writing component, the data to be written contained in the write event;
and wherein allowing, by the service coordination component, writing and/or reading of the compressed historical data in response to completion of the compression of the historical data comprises:
in response to completion of the compression of the historical data, performing, by the data writing component, supplementary recording on the compressed historical data using the cached data to be written.
2. The data processing method according to claim 1, wherein the data compression condition includes at least one of: a preconfigured time condition, a preconfigured read-write flow condition, a preconfigured state condition, and a preconfigured data table name condition.
3. A data processing method according to claim 2, wherein the preconfigured time condition comprises that a time point of the historical data is earlier than a predetermined period of time before the current moment, and/or
The preconfigured read-write flow condition includes that the read-write flow of the historical data is less than or equal to a predetermined flow threshold, and/or
The pre-configured status condition includes that the history data is in a non-read-write state, and/or
The preconfigured data table name condition includes that the data table name of the history data is included in the data table name specified to be compressed.
4. The data processing method of claim 1, wherein prohibiting, by the service coordination component, writing and/or reading of the historical data under compression comprises: in response to detecting the read event, outputting, by the service coordination component, a read-prohibited notification for the historical data being compressed.
5. A data processing method according to any one of claims 1 to 3, wherein compressing, by the data compression component, the historical data satisfying the data compression condition comprises: compressing, by the data compression component, the historical data satisfying the data compression condition using a data encoding algorithm and/or a distributed storage algorithm.
6. The data processing method of claim 5, wherein the data encoding algorithm comprises one or more of the following algorithms: erasure coding algorithm, Huffman coding algorithm, run-length coding algorithm, and/or
The distributed storage algorithm includes one or more of the following algorithms: snapshot storage algorithm, multi-copy storage algorithm.
7. The data processing method according to claim 6, wherein the data compression component compresses the history data satisfying the data compression condition using an erasure coding algorithm, and
The compressing of the historical data meeting the data compression condition by the data compression component through the erasure coding algorithm comprises the following steps:
Creating a directory based on an erasure coding algorithm;
Migrating the historical data meeting the data compression condition to the directory based on the erasure coding algorithm to compress the historical data;
and replacing the name of the directory based on the erasure coding algorithm with the original directory name of the historical data.
8. A data processing method according to any one of claims 1 to 3, characterized in that the data processing method further comprises: backing up historical data meeting data compression conditions through the data compression component; and in response to the compression failure or timeout of the historical data, rolling back by using the backed-up historical data to restore the historical data.
9. A data processing method according to any one of claims 1 to 3, characterized in that the data processing method is used for processing data of a new energy power generation device by the data processing system.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the data processing method according to any one of claims 1 to 9.
11. A computing device, the computing device comprising:
A processor;
Memory storing a computer program which, when executed by a processor, implements the data processing method according to any one of claims 1 to 9.
12. A data processing system, the data processing system comprising:
A data compression component configured to compress historical data satisfying data compression conditions;
A service coordination component configured to prohibit writing and/or reading of historical data in compression; in response to the compression of the history data being completed, allowing writing and/or reading of the compressed history data;
wherein the service coordination component is further configured to monitor write events and/or read events directed to the historical data being compressed;
Wherein the data processing system further comprises a data writing component configured to:
in response to the service coordination component detecting the write event, cache the data to be written contained in the write event; and
in response to completion of the compression of the historical data, perform supplementary recording on the compressed historical data using the cached data to be written.
13. The data processing system of claim 12, wherein the data compression condition comprises at least one of: a preconfigured time condition, a preconfigured read-write flow condition, a preconfigured state condition, and a preconfigured data table name condition.
14. A data processing system according to claim 13, wherein the preconfigured time condition comprises that a time point of the historical data is earlier than a predetermined period of time before the current moment, and/or
The preconfigured read-write flow condition includes that the read-write flow of the historical data is less than or equal to a predetermined flow threshold, and/or
The pre-configured status condition includes that the history data is in a non-read-write state, and/or
The preconfigured data table name condition includes that the data table name of the history data is included in the data table name specified to be compressed.
15. The data processing system of claim 12, wherein the service coordination component is configured to: in response to detecting the read event, output a read-prohibited notification for the historical data being compressed.
16. The data processing system of any of claims 12 to 14, wherein the data compression component is configured to: compress the historical data satisfying the data compression condition using a data encoding algorithm and/or a distributed storage algorithm.
17. The data processing system of claim 16, wherein the data encoding algorithm comprises one or more of the following algorithms: erasure coding algorithm, Huffman coding algorithm, run-length coding algorithm, and/or
The distributed storage algorithm includes one or more of the following algorithms: snapshot storage algorithm, multi-copy storage algorithm.
18. The data processing system of claim 17, wherein the data compression component is configured to:
Creating a directory based on an erasure coding algorithm;
Migrating the historical data meeting the data compression condition to the directory based on the erasure coding algorithm to compress the historical data;
and replacing the name of the directory based on the erasure coding algorithm with the original directory name of the historical data.
19. The data processing system of any of claims 12 to 14, wherein the data compression component is further configured to: backing up historical data meeting data compression conditions; and in response to the compression failure or timeout of the historical data, rolling back by using the backed-up historical data to restore the historical data.
20. The data processing system of any one of claims 12 to 14, wherein the data processing system is configured to process data of a new energy power plant.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110320495.6A CN112905127B (en) | 2021-03-25 | 2021-03-25 | Data processing method and data processing system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112905127A (en) | 2021-06-04 |
| CN112905127B (en) | 2024-08-09 |
Family
ID=76106457
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110320495.6A Active CN112905127B (en) | 2021-03-25 | 2021-03-25 | Data processing method and data processing system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112905127B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113753977A (en) * | 2021-09-06 | 2021-12-07 | 北京思源广泰科技有限公司 | Data processing method and system |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103198068A (en) * | 2012-01-07 | 2013-07-10 | 湖南省电力勘测设计院 | Application method of dynamic information database in distribution automation system |
| CN111459404A (en) * | 2020-03-03 | 2020-07-28 | 平安科技(深圳)有限公司 | Data compression method and device, computer equipment and storage medium |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7546431B2 (en) * | 2005-03-21 | 2009-06-09 | Emc Corporation | Distributed open writable snapshot copy facility using file migration policies |
| CN101344893B (en) * | 2008-07-17 | 2010-06-02 | 中兴通讯股份有限公司 | History data access method and apparatus |
| CN102521256B (en) * | 2011-11-17 | 2013-07-10 | 广东电网公司电力科学研究院 | High-reliability data protection method of real-time/historical database |
| CN103593424A (en) * | 2013-11-07 | 2014-02-19 | 浪潮电子信息产业股份有限公司 | Configurable big-data compression processing system integrating software and hardware |
| CN106557272B (en) * | 2015-09-30 | 2019-07-30 | 中国科学院软件研究所 | A kind of efficient sensor historic data archiving method |
| CN107526743B (en) * | 2016-06-21 | 2020-08-07 | 伊姆西Ip控股有限责任公司 | Method and apparatus for compressing file system metadata |
| CN107450856A (en) * | 2017-08-10 | 2017-12-08 | 北京元心科技有限公司 | Writing method and reading method of stored data, corresponding devices and terminals |
| CN109947799B (en) * | 2018-07-25 | 2023-03-24 | 光大环境科技(中国)有限公司 | Method and device for carrying out reverse compression reading on historical data |
| CN109302449B (en) * | 2018-08-31 | 2022-03-15 | 创新先进技术有限公司 | Data writing method, data reading device and server |
| CN109815026A (en) * | 2018-12-18 | 2019-05-28 | 国电南京自动化股份有限公司 | Electric power time series database based on distributed component |
| CN110704199B (en) * | 2019-09-06 | 2024-07-05 | 深圳平安通信科技有限公司 | Data compression method, device, computer equipment and storage medium |
| CN111984610A (en) * | 2020-09-27 | 2020-11-24 | 苏州浪潮智能科技有限公司 | Data compression method, apparatus and computer readable storage medium |
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103198068A (en) * | 2012-01-07 | 2013-07-10 | 湖南省电力勘测设计院 | Application method of dynamic information database in distribution automation system |
| CN111459404A (en) * | 2020-03-03 | 2020-07-28 | 平安科技(深圳)有限公司 | Data compression method and device, computer equipment and storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112905127A (en) | 2021-06-04 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |