CN118626518B - Data processing system, method, device, storage medium and program product - Google Patents
Data processing system, method, device, storage medium and program product
- Publication number
- CN118626518B
- Authority
- CN
- China
- Prior art keywords
- data
- alarm
- data processing
- processing platform
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application provide a data processing system, method, device, storage medium, and program product. The data processing system comprises a stream data processing platform, a big data processing platform, a cache database, a relational database, and a distributed file storage system. The stream data processing platform is communicatively connected to the cache database, the relational database, the distributed file storage system, and the big data processing platform. The stream data processing platform is configured to receive original telemetry data reported by a target device, extract real-time data and query data from the original telemetry data, store the real-time data in the cache database, store the query data in the relational database, and store the original telemetry data in the distributed file storage system in file form. The system can effectively improve the speed of processing and responding to data generated by the target device.
Description
Technical Field
The present application relates to the field of data processing technology, and in particular, to a data processing system, a method, a device, a storage medium, and a program product.
Background
During operation of some devices, data relating to the operation of the device is generated, which data plays a vital role in maintaining safe, stable and efficient operation of the device.
In the related art, data generated in the running process of the device can be stored and processed through a database management system. However, if the volume of data generated in the running process of the device is huge, the database management system processes and responds to the data slowly.
Disclosure of Invention
The embodiment of the application provides a data processing system, a method, equipment, a storage medium and a program product, which can effectively improve the processing and response speed of data.
In a first aspect, an embodiment of the present application provides a data processing system, where the data processing system includes a stream data processing platform, a big data processing platform, a cache database, a relational database, and a distributed file storage system, where the stream data processing platform is respectively in communication connection with the cache database, the relational database, the distributed file storage system, and the big data processing platform, and the big data processing platform is also in communication connection with the distributed file storage system;
The stream data processing platform is used for receiving original telemetry data reported by the target equipment, extracting real-time data and query data from the original telemetry data, storing the real-time data into the cache database, storing the query data into the relational database and storing the original telemetry data into the distributed file storage system in a file form, wherein the real-time data is data that changes dynamically in the running process of the target equipment;
The big data processing platform is used for responding to the data processing instruction of the stream data processing platform, carrying out statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain the statistical data of the target equipment, and storing the statistical data into the relational database.
In some embodiments, the stream data processing platform is in communication connection with the target device through an internet of things platform, and is specifically configured to:
Receiving a JSON message sent by the Internet of things platform;
according to a preset table structure, analyzing the original telemetry data from the JSON message;
extracting initial real-time data and initial query data from the original telemetry data, wherein the initial query data comprises at least one of fault alarm data, system data and version change data;
Performing de-duplication processing on the initial real-time data to obtain the real-time data, and storing the real-time data into the cache database;
Filtering and splitting the initial query data to obtain the query data, and storing the query data into the relational database in a key value pair mode;
Storing the original telemetry data in a columnar file storage format into the distributed file storage system.
In some embodiments, the stream data processing platform is specifically configured to:
Extracting initial real-time data in a time window from the original telemetry data;
partitioning the initial real-time data in the time window according to the equipment type;
Taking the latest data in each partition as the real-time data;
and storing the real-time data into a partition corresponding to the equipment identifier in the cache database in a mode of covering old data of the equipment identifier according to the equipment identifier corresponding to the real-time data.
In some embodiments, the original telemetry data includes alarm class data, the query data includes alarm data, and N bits of the alarm class data correspond to one type of alarm event;
the stream data processing platform is specifically configured to:
splitting the alarm class data bit by bit to obtain a plurality of pieces of alarm data, wherein each piece of alarm data corresponds to one alarm event;
And writing the alarm data into the corresponding alarm record in the relational database, wherein the alarm record comprises a first byte and a second byte, the first byte is used for indicating whether an alarm event exists, and the second byte is used for indicating whether the alarm event has ended.
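The bit-splitting step can be illustrated with a minimal sketch. The event names and the default of one bit per event are assumptions made for illustration; the actual bit layout is device-specific and is not given in the text.

```python
from typing import Dict, List

# Hypothetical event names, ordered from the least significant bits upward.
EVENT_NAMES: List[str] = [
    "over_voltage",
    "under_voltage",
    "over_temperature",
    "communication_loss",
]


def split_alarm_word(
    alarm_word: int,
    event_names: List[str] = EVENT_NAMES,
    bits_per_event: int = 1,
) -> Dict[str, int]:
    """Split a packed alarm word into one alarm value per event type.

    Each event type occupies `bits_per_event` consecutive bits (the "N bits"
    of the text); N = 1 is assumed by default.
    """
    mask = (1 << bits_per_event) - 1
    return {
        name: (alarm_word >> (index * bits_per_event)) & mask
        for index, name in enumerate(event_names)
    }
```

Each entry of the returned dictionary is one piece of alarm data that can then be written to its own alarm record.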
In some embodiments, the stream data processing platform is specifically configured to:
acquiring an alarm value corresponding to the alarm data, and inquiring an alarm record matched with the alarm data from the relational database;
under the condition that the alarm value indicates that no alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record;
under the condition that the alarm value indicates that an alarm event has occurred, if the alarm record is found and indicates that the alarm event corresponding to the alarm data has ended, or if no alarm record is found, writing the alarm value by adding a new alarm record;
and under the condition that the alarm value indicates that an alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record.
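The three cases above can be sketched as a small decision function. The `AlarmRecord` type and the action names are hypothetical, the second case is read here as inserting a new alarm record, and the "skip" outcome (no alarm and no open record) is an assumption, since the text does not state what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmRecord:
    """Hypothetical in-memory view of one alarm record."""
    alarm_value: int  # non-zero if an alarm event exists
    ended: bool       # True once the alarm event has ended


def decide_action(alarm_value: int, record: Optional[AlarmRecord]) -> str:
    """Return 'update', 'insert', or 'skip' for one piece of alarm data."""
    alarm_active = alarm_value != 0
    has_open_record = record is not None and not record.ended
    if has_open_record:
        # Cases 1 and 3: an unfinished record exists, so its value is updated
        # whether the alarm is still active or has just cleared.
        return "update"
    if alarm_active:
        # Case 2: no record, or the matching record has already ended.
        return "insert"
    # Assumption: nothing is written when there is no alarm and no open record.
    return "skip"
```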
In some embodiments, the target device is a device in an energy storage power station, the big data processing platform is further configured to:
acquiring operation data of an energy storage device of the energy storage power station from the distributed file storage system;
Predicting the running condition of the energy storage device by using a machine learning model to generate a prediction result;
Writing the prediction result into the relational database;
And/or outputting alarm information of the abnormality of the energy storage device when the prediction result indicates that the abnormality exists in the energy storage device.
In some embodiments, the data processing system further comprises a web application end which is respectively in communication connection with the cache database, the relational database, the distributed file storage system and the big data processing platform;
the web application end is used for:
Acquiring real-time data from the cache database;
Acquiring query data from the relational database and/or the big data processing platform;
the raw telemetry data is obtained from the distributed file storage system.
In some embodiments, the distributed file storage system is further configured to receive raw telemetry data of the target device that is uploaded in bulk by the web application.
In some embodiments, the stream data processing platform is further configured to:
running an equipment pre-configuration program, wherein the equipment pre-configuration program is used for defining, in the Internet of things platform, the Internet of things equipment that receives the original telemetry data.
In a second aspect, an embodiment of the present application provides a data processing method, where the method is applied to a data processing system, where the data processing system includes a stream data processing platform, a big data processing platform, a cache database, a relational database, and a distributed file storage system, where the stream data processing platform is respectively in communication connection with the cache database, the relational database, the distributed file storage system, and the big data processing platform, and the big data processing platform is also in communication connection with the distributed file storage system;
the stream data processing platform is used for receiving original telemetry data reported by target equipment, extracting real-time data and query data from the original telemetry data, storing the real-time data into the cache database, storing the query data into the relational database and storing the original telemetry data into the distributed file storage system, wherein the real-time data is dynamically changed data in the running process of the target equipment;
The big data processing platform is used for responding to the data processing instruction of the stream data processing platform, carrying out statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain the statistical data of the target equipment, and storing the statistical data into the relational database.
In a third aspect, an embodiment of the present application provides a data processing apparatus, including a memory and a processor;
The memory is for storing computer instructions and the processor is for executing the computer instructions stored by the memory to implement the method of any one of the first aspects.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program for execution by a processor to perform the method of any one of the first aspects.
In a fifth aspect, the application provides a computer program product comprising a computer program which, when executed by a processor, implements the method of any of the first aspects.
The data processing system provided by the embodiments of the present application comprises a stream data processing platform, a big data processing platform, a cache database, a relational database and a distributed file storage system. The stream data processing platform is respectively in communication connection with the cache database, the relational database and the distributed file storage system, and the big data processing platform is also in communication connection with the distributed file storage system. The stream data processing platform is used for receiving original telemetry data reported by target equipment, extracting real-time data and query data from the original telemetry data, storing the real-time data into the cache database, storing the query data into the relational database and storing the original telemetry data into the distributed file storage system, wherein the real-time data is data that changes dynamically in the running process of the target equipment. The big data processing platform is used for carrying out statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain statistical data of the target equipment. With this system, the speed of processing and responding to data generated by the target equipment can be effectively improved, and the use requirements of users are met.
Drawings
FIG. 1 is a schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a process for storing real-time data in a cache database according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a process of writing alarm data into a relational database according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In the embodiments of the application, the words "first", "second", etc. are used to distinguish identical or similar items having substantially the same function and effect, and do not limit their sequence. It will be appreciated by those of skill in the art that the words "first," "second," and the like do not limit the quantity or the order of execution, and do not necessarily indicate that the items are different.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to denote examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The energy storage power station is a solution provided for solving the problems of renewable energy volatility, load demand peak value, stability and the like of the power system, realizes the supply and demand balance and stability of the power system while reducing carbon emission, provides stronger and more sustainable support for the future power system, and has great significance for the power industry and sustainable energy development strategy.
During operation of the energy storage power station, considerable data volumes are generated, which play a vital role in maintaining safe, stable and efficient operation of the power station. Therefore, how to collect, store and analyze the data generated during the operation of the energy storage power station is of great importance.
In the related art, data generated during the operation of the energy storage power station devices can be stored and processed through a database management system (Database Management System, DBMS). For example, the data of the energy storage power station is collected through sensors, the collected data is stored in a traditional relational database (such as MySQL or SQL Server), and the data is retrieved and analyzed for a user or a manager by means of SQL queries.
However, a conventional relational database runs into performance bottlenecks, such as degraded query performance and memory pressure, when the amount of data in a single table reaches tens of millions of records. The data volume of an energy storage power station can easily reach tens of millions of records in a single day, so a conventional relational database cannot provide sufficient performance and response speed, and cannot handle filtering queries and statistical analysis over massive data. This can cause the monitoring and management of the energy storage power station to lag, so that when a fault alarm occurs in the energy storage power station, the abnormal condition cannot be handled in time and unnecessary losses result.
In view of this, embodiments of the present application provide a data processing system, method, apparatus, storage medium, and program product, which can provide sufficient performance and response speed by analyzing and processing received data and storing different data in different types of databases, and which deploy a big data processing platform in the data processing system so as to handle filtering queries and statistical analysis of massive data.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be implemented independently or combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
FIG. 1 is a schematic diagram of a data processing system according to an embodiment of the present application, where the data processing system, as shown in FIG. 1, includes a stream data processing platform, a big data processing platform, a cache database, a relational database, and a distributed file storage system.
The stream data processing platform is respectively in communication connection with the cache database, the relational database, the distributed file storage system and the big data processing platform, and the big data processing platform is also in communication connection with the distributed file storage system.
In some embodiments, the streaming data processing platform is communicatively coupled to the target device.
In some embodiments, the stream data processing platform is communicatively connected to the target device through an internet of things platform.
In some embodiments, the target device may be a device in an energy storage power station, or other devices that may generate a large amount of data, which is not limited by embodiments of the present application. That is, the data processing system provided by the embodiment of the application not only can store and process the data of the energy storage power station, but also can store and process the data generated by other devices or systems (for example, power grid data, operator data and the like).
For ease of understanding, the following description will take the target device as a device of the energy storage power station, and the transmission of the original telemetry data through the internet of things platform as an example.
In some embodiments, the stream data processing platform is configured to receive raw telemetry data reported by the energy storage power station.
The energy storage power station is provided with data acquisition equipment, the data acquisition equipment can acquire data (original telemetry data) in the running process of each equipment in the energy storage power station through a sensor, and the original telemetry data is reported to a stream data processing platform in the data processing system through an internet of things platform (IoT Hub).
In a possible implementation manner, the data acquisition device may acquire the original telemetry data periodically with a preset time as a period, and send the acquired original telemetry data to the internet of things platform, and when the internet of things platform receives the original telemetry data, the data acquisition device may forward the original telemetry data to the stream data processing platform.
In another possible implementation manner, the internet of things platform may periodically send a data acquisition instruction to the data acquisition device with a preset time as a period, receive original telemetry data that is sent by the data acquisition device and is acquired based on the data acquisition instruction, and forward the original telemetry data to the stream data processing platform.
In some embodiments, when the internet of things platform receives the original telemetry data, the original telemetry data may be encapsulated in a target format (e.g., JSON format), and the encapsulated original telemetry data may be sent to the stream data processing platform in the form of a message (e.g., JSON message).
When receiving the JSON message sent by the internet of things platform, the stream data processing platform can analyze the JSON message according to a preset table structure, and extract the original telemetry data of the energy storage power station.
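As an illustration of parsing a JSON message against a preset table structure, the following minimal sketch validates and extracts the expected columns. The column names and types in `TABLE_STRUCTURE` are hypothetical; the actual table structure is not specified in the text.

```python
import json
from typing import Any, Dict

# Hypothetical preset table structure: column name -> expected Python type.
TABLE_STRUCTURE: Dict[str, type] = {
    "device_id": str,
    "device_type": str,
    "timestamp": int,
    "payload": dict,
}


def parse_message(message: str) -> Dict[str, Any]:
    """Extract the columns named in TABLE_STRUCTURE from a JSON message.

    Fields not listed in the table structure are ignored; a missing or
    mistyped column raises ValueError.
    """
    body = json.loads(message)
    row: Dict[str, Any] = {}
    for column, expected_type in TABLE_STRUCTURE.items():
        if column not in body:
            raise ValueError(f"missing column: {column}")
        value = body[column]
        if not isinstance(value, expected_type):
            raise ValueError(f"column {column} is not {expected_type.__name__}")
        row[column] = value
    return row
```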
In some embodiments, after the stream data processing platform obtains the original telemetry data of the energy storage power station, the original telemetry data may be processed, for example, the stream data processing platform may store real-time data and query data extracted from the original telemetry data according to a preset data type, the real-time data is stored in the cache database, the query data is stored in the relational database, and the original telemetry data is stored in a file form in the distributed file storage system.
The real-time data are data which dynamically change in the running process of the equipment of the energy storage power station, such as equipment state, power generation amount, battery pack state and the like. The query data is some static information, such as user information, system data, fault alarm information, version change information and the like, in the running process of the equipment of the energy storage power station.
In some embodiments, the stream data processing platform may extract initial real-time data and initial query data from the original telemetry data.
The stream data processing platform can perform de-duplication processing on the initial real-time data to obtain the real-time data, and store the real-time data into the cache database. The stream data processing platform can adopt a preset automatic de-duplication tool to de-duplicate the initial real-time data, so as to obtain the real-time data. By storing the real-time data into the cache database, a user can monitor the energy storage power station in real time through the cache database, which effectively improves the timeliness of monitoring the energy storage power station.
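The de-duplication tool itself is not specified. A minimal sketch of one plausible rule, assuming that two readings are duplicates when they share the same device identifier and timestamp (hypothetical field names `device_id` and `timestamp`), could look like this:

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


def deduplicate(readings: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first reading for each (device_id, timestamp) pair."""
    seen: Set[Tuple[str, int]] = set()
    unique: List[Dict[str, Any]] = []
    for reading in readings:
        key = (reading["device_id"], reading["timestamp"])
        if key in seen:
            continue  # a reading with the same identity was already kept
        seen.add(key)
        unique.append(reading)
    return unique
```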
The stream data processing platform can filter and split the initial query data to obtain the query data, and store the query data into the relational database in a key value pair mode.
The stream data processing platform can perform data filtering on the initial query data to remove redundant information and make the data more concise, split the initial query data into different query data according to different type identifiers, and store the different query data in the relational database in a key value pair mode.
A key-value pair (key-value) is a simple correspondence in which the key (key) serves as the index of an element and the value (value) represents the data that is stored and read. When the stream data processing platform obtains the query data, it can use the device identifier corresponding to the query data as the key and the query data as the corresponding value. A user can later quickly acquire the corresponding value (query data) through the key, which improves the performance and response speed of the data processing system.
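The filter, split, and key-value storage steps can be sketched with an in-memory SQLite database standing in for the relational database. The table name `query_data`, its columns, and the record fields `type`, `device_id`, and `data` are all hypothetical; dropping records with empty content is only one possible filtering rule.

```python
import json
import sqlite3
from typing import Any, Dict, Iterable, Optional


def store_query_data(conn: sqlite3.Connection,
                     records: Iterable[Dict[str, Any]]) -> None:
    """Filter records, split them by type identifier, and store key-value rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_data ("
        "data_type TEXT NOT NULL, item_key TEXT NOT NULL, "
        "item_value TEXT NOT NULL, PRIMARY KEY (data_type, item_key))"
    )
    rows = [
        (r["type"], r["device_id"], json.dumps(r["data"], sort_keys=True))
        for r in records
        if r.get("data")  # filtering: drop records that carry no content
    ]
    with conn:  # commit all rows in one transaction
        conn.executemany(
            "INSERT OR REPLACE INTO query_data VALUES (?, ?, ?)", rows
        )


def lookup(conn: sqlite3.Connection, data_type: str,
           key: str) -> Optional[Dict[str, Any]]:
    """Fetch the value stored under a device identifier for one data type."""
    row = conn.execute(
        "SELECT item_value FROM query_data WHERE data_type = ? AND item_key = ?",
        (data_type, key),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Using the device identifier as part of the primary key means a lookup is a single indexed read rather than a scan over the whole table.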
The stream data processing platform may store the original telemetry data into the distributed file storage system in a columnar file storage format (e.g., the Parquet file format). Optionally, the stream data processing platform may further compress the original telemetry data and store the compressed original telemetry data in the distributed file storage system in the columnar file storage format, so as to reduce the storage space occupied in the distributed file storage system. By storing the original telemetry data in the distributed file storage system, the original telemetry data remains traceable, and its large volume does not affect system performance.
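To make the idea of column-oriented storage with compression concrete, the sketch below pivots row records into columns and compresses the result using only the Python standard library. It is a simplified stand-in and does not produce Parquet files; it also assumes every row has the same fields.

```python
import gzip
import json
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list of values per column."""
    if not rows:
        return {}
    return {name: [row[name] for row in rows] for name in rows[0]}


def encode_columnar(rows: List[Dict[str, Any]]) -> bytes:
    """Serialize rows column by column and compress the result."""
    return gzip.compress(json.dumps(rows_to_columns(rows)).encode("utf-8"))


def decode_columnar(blob: bytes) -> List[Dict[str, Any]]:
    """Decompress a blob and rebuild the original row records."""
    columns = json.loads(gzip.decompress(blob).decode("utf-8"))
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]
```

Grouping values of the same field together tends to compress better than row-by-row storage, because neighbouring values in a column are usually similar.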
Optionally, the stream data processing platform may further perform data cleansing on the original telemetry data when writing the original telemetry data into the distributed file storage system, thereby improving quality of the original telemetry data and reducing probability of error occurring when using the original telemetry data later.
In some embodiments, some valuable statistical analysis data (referred to as statistical data for short) can be derived from the original telemetry data, for example, the number of charge-discharge cycles of the power station and the statistical analysis of the charge and discharge amounts. These statistical data need to be obtained by performing complex statistical analysis calculations on the original telemetry data. So that the user can quickly obtain these data, the data processing system can process the original telemetry data through the big data processing platform to obtain the statistical data.
The big data processing platform is used for carrying out statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain statistical data of the energy storage power station, and storing the statistical data into the relational database.
In some embodiments, a big data computing engine (for example, the Spark big data computing engine) is built into the big data processing platform. When the big data processing platform receives a data processing instruction sent by the stream data processing platform, the original telemetry data corresponding to the data processing instruction can be read from the distributed file storage system, and the data processing instruction is executed by the big data computing engine to obtain the statistical data.
In some embodiments, a data processing program may be preset in the big data processing platform, a preset calculation program is executed by the big data calculation engine, corresponding original telemetry data is read from the distributed file storage system, and statistical analysis is performed on the original telemetry data to obtain the statistical data.
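The kind of aggregation such a program performs can be illustrated in plain Python; a real deployment would run the equivalent job on the big data computing engine. The field names `station_id`, `charge_kwh`, and `discharge_kwh` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize(records: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Total the charge and discharge amounts per station."""
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"charge_kwh": 0.0, "discharge_kwh": 0.0, "samples": 0}
    )
    for record in records:
        station = totals[record["station_id"]]
        station["charge_kwh"] += record["charge_kwh"]
        station["discharge_kwh"] += record["discharge_kwh"]
        station["samples"] += 1
    return dict(totals)
```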
After the big data processing platform obtains the statistical data, the statistical data can be stored in the relational database in a key-value pair (key-value) mode, so that the user can conveniently and quickly query the statistical data.
According to the data processing system provided by the embodiment of the application, the original telemetry data of the energy storage power station is processed through the stream data processing platform, the real-time data is stored in the cache database, the query data is stored in the relational database, the original telemetry data is stored in the distributed file storage system, the statistics analysis is carried out on the original telemetry data through the big data processing platform, the statistics data is obtained and stored in the relational database, the performance and the response speed of the data processing system can be effectively improved, the requirements of filtering query and statistics analysis on massive data are met, and the monitoring capability and efficiency of the energy storage power station are improved.
The process of storing real-time data into the cache database by the stream data processing platform is described in detail below on the basis of the embodiment of FIG. 1.
Fig. 2 is a schematic process diagram of storing real-time data in the cache database by a stream data processing platform according to an embodiment of the present application. As shown in fig. 2, the process includes:
S201, extracting initial real-time data in a time window from the original telemetry data.
In some embodiments, to prevent real-time data from being written to the cache database too frequently, which would place excessive pressure on the server of the cache database and degrade its performance, the stream data processing platform may set a time window during processing (e.g., by creating a tumbling window function such as TumblingWindow()) and extract the initial real-time data within the time window from the original telemetry data. For example, when the original telemetry data covers 10 consecutive minutes and the time window set by the stream data processing platform is the last 1 minute, the stream data processing platform extracts the initial real-time data of the last 1 minute from the original telemetry data.
S202, partitioning the initial real-time data in the time window according to the device type.
In some embodiments, the real-time data of different devices may be different, and the addresses stored in the cache database may also be different, so the stream data processing platform may partition the acquired initial real-time data by device types, where each partition corresponds to one type of initial real-time data.
S203, taking the latest data in each partition as the real-time data.
In some embodiments, since the cache database caches real-time data of each device, the stream data processing platform may use the latest data in each partition as the real-time data, thereby obtaining real-time data with minimum data transmission cost, effectively reducing the number of write requests, and reducing the probability of downtime or processing delay of a server of the cache database.
S204, storing the real-time data into a partition corresponding to the equipment identifier in the cache database in a mode of covering old data of the equipment identifier according to the equipment identifier corresponding to the real-time data.
In some embodiments, when the stream data processing platform writes the real-time data of each device into the cache database, the device identifier of each device may be used as the unique ID written into the cache database. If data corresponding to the unique ID already exists in the cache database (i.e., old data exists), the stream data processing platform may store the real-time data into the partition corresponding to the device identifier in the cache database by overwriting the old data of that device identifier. If no data corresponding to the unique ID exists in the cache database (i.e., no old data exists), the stream data processing platform may create a new data partition in the cache database and store the real-time data in the new data partition as a new entry. Because the unique ID is used as the cache key, the data item with the same ID is overwritten on every write, which ensures that the data cached for each device is both current and unique.
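Steps S201 to S204 can be sketched as follows. This is only an illustration under stated assumptions: the record fields (device_id, device_type, ts), the half-open window bounds, and the plain dict standing in for the cache database are hypothetical, and "the latest data in each partition" is interpreted here as the newest record of each device within its device-type partition.

```python
def latest_per_device(records, window_start, window_end):
    """S201-S203: keep records in [window_start, window_end), partition them
    by device type, and keep only the newest record of each device."""
    partitions = {}
    for rec in records:
        if not (window_start <= rec["ts"] < window_end):
            continue  # outside the time window
        part = partitions.setdefault(rec["device_type"], {})
        current = part.get(rec["device_id"])
        if current is None or rec["ts"] > current["ts"]:
            part[rec["device_id"]] = rec
    return partitions

def write_to_cache(cache, partitions):
    """S204: the device identifier is the key, so new data overwrites old data."""
    for part in partitions.values():
        for device_id, rec in part.items():
            cache[device_id] = rec

records = [
    {"device_id": "bat-1", "device_type": "battery", "ts": 540, "soc": 80},
    {"device_id": "bat-1", "device_type": "battery", "ts": 590, "soc": 79},
    {"device_id": "pcs-1", "device_type": "pcs", "ts": 100, "power": 5},
    {"device_id": "pcs-1", "device_type": "pcs", "ts": 560, "power": 7},
]
cache = {"bat-1": {"soc": 85}}  # old data for bat-1 already cached
write_to_cache(cache, latest_per_device(records, 540, 600))
```

With this approach each device produces at most one write per window, regardless of how many records it reported inside the window.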
In some embodiments, the original telemetry data reported by the energy storage power station includes alarm class data, and the query data that the stream data processing platform extracts from the telemetry data includes alarm data. Since every N bits (for example, 1 bit) of the alarm class data correspond to one type of alarm event, the stream data processing platform needs to split the alarm class data before writing it into the relational database. This process is described below with reference to fig. 3.
Fig. 3 is a schematic diagram of a process of writing alarm data into a relational database according to an embodiment of the present application, as shown in fig. 3, including:
S301, splitting the alarm class data according to bits to obtain a plurality of alarm data, wherein one alarm data corresponds to one alarm event.
Illustratively, the alarm class data includes 8 bits, each bit corresponding to one type of alarm of the device. For example, the alarm class data is 00110101.
The stream data processing platform splits the alarm class data according to bits to obtain a plurality of alarm data, for example, splits the alarm class data 00110101 into 8 alarm data. For the first bit, whose value is 0, the value may indicate either that the alarm event of this type has not occurred or that it has occurred, depending on the convention adopted. The following description takes as an example the case in which 0 indicates that this type of alarm event has not occurred.
It should be understood that the alarm type corresponding to each bit is predefined in the stream data processing platform.
Optionally, the stream data processing platform further reorganizes the 8 pieces of alarm data, for example, by adding the device identifier to each of them, to obtain 8 pieces of reorganized alarm data.
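The bit splitting and reorganization of S301 can be sketched in Python as follows. The alarm type names, the record layout, and the convention that the leftmost character is the first bit are assumptions for illustration; the application only states that the alarm type for each bit is predefined.

```python
def split_alarm_bits(device_id, alarm_class, alarm_types):
    """Split an alarm class bit string into one alarm record per bit."""
    if len(alarm_class) != len(alarm_types):
        raise ValueError("one alarm type is required per bit")
    return [
        # Reorganization step: attach the device identifier to each alarm.
        {"device_id": device_id, "alarm_type": alarm_type, "bit_value": int(bit)}
        for alarm_type, bit in zip(alarm_types, alarm_class)
    ]

ALARM_TYPES = [f"alarm_{i}" for i in range(8)]  # hypothetical predefined types
alarms = split_alarm_bits("bat-1", "00110101", ALARM_TYPES)
# Here 1 is taken to mean that the alarm event has occurred.
active = [a["alarm_type"] for a in alarms if a["bit_value"] == 1]
```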
S302, updating the alarm data into an alarm record corresponding to the relational database, wherein the alarm record comprises a first byte and a second byte, the first byte is used for indicating whether an alarm event exists or not, and the second byte is used for indicating whether the alarm event is ended or not.
In some embodiments, the data contained in the alarm record includes a site identification, a device type, a fault start flag, a fault start time, a fault end flag, a fault end time, and the like of the energy storage power station.
When the stream data processing platform acquires a plurality of alarm data, the alarm data can be updated into the corresponding alarm records according to the device identifier and the alarm type. The alarm record includes a first byte (bitValue1, fault start flag) and a second byte (bitValue2, fault end flag).
Wherein bitValue1 has a value of 1 or null, and a value of 1 indicates that the current device has a fault of this type; bitValue2 has a value of 0 or null, and a value of 0 indicates that the fault of this type has disappeared.
In some embodiments, the streaming data processing platform may write alert data into the alert record as follows:
S3021, obtaining alarm values corresponding to the alarm data, and inquiring alarm records matched with the alarm data from the relational database.
Wherein, the alarm value corresponding to the alarm data is the value of the corresponding bit (bitValue). The alarm record matching the alarm data is the most recent record, that is, the latest first byte and second byte values, for that device and alarm type.
S3022, if the alarm value indicates that no alarm event occurs, and the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record.
For any alarm data, a bitValue of 0 indicates that the current device has no fault of this type. If the alarm record is queried, and in that record bitValue1 is 1 and bitValue2 is null (the alarm record indicates that the alarm event corresponding to the alarm data has not ended), the alarm value is written into the alarm record.
Illustratively, the alarm record is as follows:
bitValue1 (1), bitValue2 (); (queried alarm record)
bitValue1 (1), bitValue2 (0). (alarm record after the alarm value is updated)
Or alternatively:
bitValue1(1,null);
bitValue2(null,0);
In bitValue1 and bitValue2, the first value is the queried value and the second value is the written alarm value (null where nothing is written).
Namely, the stream data processing platform updates the alarm data to the corresponding fault alarm record in an updating mode to indicate that the current fault has disappeared, and the current fault alarm record is a piece of complete record data.
S3023, if the alarm value indicates that an alarm event occurs, and either the alarm record is queried and indicates that the alarm event corresponding to the alarm data has ended, or no alarm record is queried, writing the alarm value as a newly added record.
For any alarm data, a bitValue of 1 indicates that the current device has a fault of this type. If no alarm record is queried, or the alarm record is queried and indicates that the alarm event corresponding to the alarm data has ended (in the corresponding alarm record, bitValue1 is null and bitValue2 is 0, or bitValue1 is 1 and bitValue2 is 0), the device has a new alarm, and the alarm value is written as a newly added alarm record.
Illustratively, the alarm record is as follows:
bitValue1 (), bitValue2 (0); (queried alarm record)
bitValue1 (1), bitValue2 (). (newly added alarm record)
Or alternatively:
bitValue1(null);
bitValue2(0);
bitValue1(1);
bitValue2(null);
Wherein the first two values, bitValue1 (null) and bitValue2 (0), are the queried values, and the last two values, bitValue1 (1) and bitValue2 (null), are the newly added values.
S3024, if the alarm value indicates that an alarm event occurs, and the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record.
For any alarm data, a bitValue of 1 indicates that the current device has a fault of this type. If the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended (in the corresponding alarm record, bitValue1 is 1 and bitValue2 is null), the device is still generating the alarm, and the alarm value is updated in the alarm record.
Illustratively, the alarm record is as follows:
bitValue1 (1), bitValue2 (); (queried alarm record)
bitValue1 (1), bitValue2 (). (alarm record after the alarm value is updated)
Or alternatively:
bitValue1(1,1);
bitValue2(null,null).
Optionally, when the alarm start identifier and/or the alarm end identifier in the alarm record are updated, the time corresponding to the alarm start identifier and/or the alarm end identifier may also be updated at the same time.
Optionally, for any alarm data with a bitValue of 1 (the current device has a fault of this type), if the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended (in the corresponding alarm record, bitValue1 is 1 and bitValue2 is null), the device is still generating the alarm, and the stream data processing platform may leave the alarm record unchanged, so that the user can easily query the time at which the alarm first occurred.
In the embodiment of the application, the stream data processing platform processes the fault alarm data by creating a stored procedure, so that the occurrence and the disappearance of the same fault or alarm are kept in a single record. This simplifies closed-loop queries of fault alarms, saves database disk space, and improves the query speed of the fault alarm data.
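The update rules of S3021 to S3024 can be sketched in Python as follows. This is an illustrative sketch, not the stored procedure itself: an in-memory list stands in for the alarm table of the relational database, and the field names start_time and end_time are assumptions. For S3024 the sketch follows the optional variant in which the open record is left unchanged so that the first occurrence time is preserved.

```python
def apply_alarm(records, device_id, alarm_type, bit_value, now):
    """Apply one alarm value to the alarm records, following S3021 to S3024."""
    # S3021: find the most recent record for this device and alarm type.
    latest = next(
        (r for r in reversed(records)
         if r["device_id"] == device_id and r["alarm_type"] == alarm_type),
        None,
    )
    is_open = (latest is not None and latest["bitValue1"] == 1
               and latest["bitValue2"] is None)
    if bit_value == 0:
        if is_open:
            # S3022: the fault has disappeared, so close the open record.
            latest["bitValue2"] = 0
            latest["end_time"] = now
    elif is_open:
        # S3024: the alarm is still active; the record keeps its start time.
        latest["bitValue1"] = 1
    else:
        # S3023: no record, or the last record is closed: add a new record.
        records.append({
            "device_id": device_id, "alarm_type": alarm_type,
            "bitValue1": 1, "start_time": now,
            "bitValue2": None, "end_time": None,
        })

records = []
for t, bit in enumerate([1, 1, 0, 0, 1], start=1):
    apply_alarm(records, "bat-1", "alarm_2", bit, now=t)
```

After the five values above, the list holds one closed record (started at time 1, ended at time 3) and one open record started at time 5, so each fault occupies exactly one row.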
On the basis of the above embodiments, the data processing system provided by the embodiment of the present application further includes a web application end.
FIG. 4 is a schematic structural diagram of a data processing system according to an embodiment of the present application, where, as shown in FIG. 4, the data processing system further includes a web application end, where the web application end is respectively in communication connection with the cache database, the relational database, the distributed file storage system, and the big data processing platform.
In some embodiments, the web application obtains real-time data from the cache database. For example, the web application may execute a predefined query command to obtain real-time data from the cache database, and display the real-time data, so as to monitor the operation condition of the energy storage power station in real time.
Optionally, the cache database may further process the real-time data, for example, process the real-time data into a preset format, so that the web application end can display the real-time data conveniently, or perform operations such as filtering the real-time data.
In some embodiments, the web application end obtains query data from the relational database and/or the big data processing platform. For example, the web application end may execute a predefined query statement to obtain the query data from the relational database and present the query data to the user. Alternatively, when the big data processing platform obtains the query data, it sends the query data to the web application end while storing the query data in the relational database.
In some embodiments, the web application obtains the raw telemetry data from the distributed file storage system. The distributed file storage system is preset with a data interface, and the web application end can directly acquire the original telemetry data from the distributed file storage system through the data interface. Or the web application side can acquire the original telemetry data from the distributed file storage system through a big data processing platform.
In some embodiments, some devices of the energy storage power station may be offline devices that are not connected to the internet of things platform. The original telemetry data of these devices (referred to as offline data) may be uploaded in batches through the web application end and stored in the distributed file storage system.
In some embodiments, to further enhance monitoring of the operation of the energy storage power station, to ensure proper operation of the energy storage power station, the big data processing platform may predict the state of an energy storage device (e.g., a battery) in the energy storage power station.
The big data processing platform obtains operation data of an energy storage device of the energy storage power station from the distributed file storage system, predicts the operation condition of the energy storage device by using a machine learning model to generate a prediction result, writes the prediction result into the relational database, and/or outputs alarm information on the abnormality of the energy storage device when the prediction result indicates that the energy storage device is abnormal.
A pre-trained machine learning model is built into the big data processing platform. When the big data processing platform obtains the operation data of the energy storage device of the energy storage power station from the distributed file storage system, it can input the operation data into the machine learning model and obtain the model's prediction of the operation condition of the energy storage device. The big data processing platform can write the prediction result into the relational database and/or send the prediction result to the web application end.
The big data processing platform can also judge the prediction result, identify whether the prediction result indicates that the energy storage device is abnormal, and if the big data processing platform identifies that the energy storage device is abnormal according to the prediction result, generate abnormal alarm information and output the abnormal alarm information of the energy storage device to the web application end.
In some embodiments, before a device in the energy storage power station can send original telemetry data to the internet of things platform, an internet of things device for receiving the original telemetry data of that device must be defined in the internet of things platform. Because the energy storage power station is large in scale and has numerous devices, the internet of things devices cannot practically be defined one by one. Therefore, the stream data processing platform may also send a device provisioning program (e.g., a Device Provisioning Service, DPS, program) to the internet of things platform, the device provisioning program being used to define, in the internet of things platform, the internet of things devices for receiving the original telemetry data of the corresponding devices of the energy storage power station.
In some embodiments, when the stream data processing platform stores the original telemetry data in the distributed file storage system, it creates files in the Parquet format and sets the maximum number of data rows that a file can hold and the maximum write time, to ensure that the file size stays within a controllable range.
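The row-count and write-time limits described above amount to a file rolling policy, which can be sketched in Python as follows. The class name, parameters, and injected clock are hypothetical, and the sketch only collects rows in memory; a real implementation would write each closed batch out as one Parquet file.

```python
import time

class RollingWriter:
    """Buffer rows and close the file at max_rows rows or max_age_s seconds."""

    def __init__(self, max_rows, max_age_s, clock=time.monotonic):
        self.max_rows, self.max_age_s, self.clock = max_rows, max_age_s, clock
        self.rows, self.opened_at, self.closed_files = [], None, []

    def write(self, row):
        # Close the current file first if it has been open for too long.
        if self.rows and self.clock() - self.opened_at >= self.max_age_s:
            self._roll()
        if not self.rows:
            self.opened_at = self.clock()  # a new file starts with this row
        self.rows.append(row)
        if len(self.rows) >= self.max_rows:
            self._roll()

    def _roll(self):
        # Placeholder: here the buffered rows would be written as a Parquet file.
        self.closed_files.append(self.rows)
        self.rows = []

fake_time = [0.0]  # controllable clock for the example
writer = RollingWriter(max_rows=3, max_age_s=10, clock=lambda: fake_time[0])
for row in (1, 2, 3, 4):
    writer.write(row)
fake_time[0] = 11.0
writer.write(5)
```

In this sketch the time limit is only checked when the next row arrives; a production writer would typically also close idle files on a timer.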
In some embodiments, the stream data processing platform includes a plurality of data processing nodes, and when the stream data processing platform receives the original telemetry data, the stream data processing platform may split the original telemetry data into a plurality of sub-data (for example, split by a device type) and distribute the sub-data to the plurality of data processing nodes for processing.
In some embodiments, the stream data processing platform may employ a load balancing policy when distributing the plurality of sub-data to the plurality of data processing nodes for processing. For example, a data processing node with few current computing tasks is given more sub-data, and a data processing node with many current computing tasks is given less sub-data or none at all.
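One simple way to realize such a policy is a least-loaded-first assignment, sketched below in Python. The node names, the per-sub-data cost estimates, and the greedy strategy are assumptions for illustration; the application does not prescribe a particular load balancing algorithm.

```python
import heapq

def distribute(sub_data_costs, node_loads):
    """Assign each piece of sub-data to the node with the least current load."""
    heap = [(load, node) for node, load in node_loads.items()]
    heapq.heapify(heap)
    assignment = {node: [] for node in node_loads}
    for name, cost in sub_data_costs:
        load, node = heapq.heappop(heap)  # least-loaded node so far
        assignment[node].append(name)
        heapq.heappush(heap, (load + cost, node))
    return assignment

assignment = distribute(
    [("battery", 4), ("pcs", 2), ("meter", 1)],  # sub-data split by device type
    {"node-a": 0, "node-b": 5},                  # current load of each node
)
```

Here node-a starts idle and therefore receives the first two pieces of sub-data, while the busier node-b receives only the last one.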
In summary, the data processing system provided by the embodiment of the application stores the real-time data of each device in the cache database, which effectively improves system performance and response speed. The data generated by large-scale, globally distributed energy storage power stations is necessarily massive, and the high-value portion of this massive data is only the real-time data, so the historical data does not need to be stored in the relational database, where it would affect performance. The system can also readily cope with the data field changes and complex data nesting caused by differences between energy storage power station projects: the data storage format can be easily extended in both the stream data processing platform and the distributed cache, which greatly reduces the later maintenance cost of the projects.
On the basis of the embodiment, the embodiment of the application also provides a data processing method.
Fig. 5 is a schematic flow chart of a data processing method according to an embodiment of the present application, where the method is applied to a data processing system, and the system includes a stream data processing platform, a big data processing platform, a cache database, a relational database, and a distributed file storage system, where the stream data processing platform is respectively in communication connection with the cache database, the relational database, the distributed file storage system, and the big data processing platform is also in communication connection with the distributed file storage system, as shown in fig. 5, and includes:
S501, the stream data processing platform receives original telemetry data reported by target equipment, extracts real-time data and query data from the original telemetry data, stores the real-time data in the cache database, stores the query data in the relational database, and stores the original telemetry data in the distributed file storage system, wherein the real-time data is dynamically changing data in the equipment operation process of the energy storage power station.
S502, the big data processing platform responds to a data processing instruction of the stream data processing platform to perform statistical analysis processing on the original telemetry data stored in the distributed file storage system, so as to obtain statistical data of the target equipment, and the statistical data is stored in the relational database.
In some embodiments, the stream data processing platform is in communication connection with equipment of a plurality of energy storage power stations through an internet of things platform. The stream data processing platform receives a JSON message sent by the internet of things platform, parses the original telemetry data of the energy storage power stations from the JSON message according to a preset table structure, and extracts initial real-time data and initial query data from the original telemetry data, wherein the initial query data comprises at least one of fault alarm data, system data and version change data. The stream data processing platform performs deduplication processing on the initial real-time data to obtain the real-time data and stores the real-time data in the cache database; filters and splits the initial query data to obtain the query data and stores the query data in the relational database in the form of key-value pairs; and stores the original telemetry data in the distributed file storage system in a columnar file storage format.
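Parsing a JSON message according to a preset table structure can be sketched in Python as follows. The column names and types in TABLE_STRUCTURE and the sample message are hypothetical; the application does not disclose the actual table structure.

```python
import json

# Hypothetical preset table structure: column name -> column type.
TABLE_STRUCTURE = {"device_id": str, "device_type": str, "ts": int, "soc": float}

def parse_message(message, structure=TABLE_STRUCTURE):
    """Parse a JSON message into a row that follows the preset table structure."""
    payload = json.loads(message)
    row = {}
    for column, column_type in structure.items():
        value = payload.get(column)
        # Missing columns become None; unknown fields in the message are dropped.
        row[column] = None if value is None else column_type(value)
    return row

row = parse_message(
    '{"device_id": "bat-1", "device_type": "battery", '
    '"ts": "1700000000", "soc": 79, "extra": 1}'
)
```

Driving the parser from a table structure means that a field change in a particular project only requires updating the structure definition, not the parsing code.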
In some embodiments, the stream data processing platform extracts initial real-time data in a time window from the original telemetry data, partitions the initial real-time data in the time window according to device types, takes latest data in each partition as the real-time data, and stores the real-time data into the partition corresponding to the device identifier in the cache database in a mode of covering old data of the device identifier according to the device identifier corresponding to the real-time data.
In some embodiments, the raw telemetry data includes alert class data, the query data including alert data, N bits of the alert class data corresponding to a type of alert event.
The stream data processing platform splits the alarm class data according to bits to obtain a plurality of alarm data, one alarm data corresponding to one alarm event, and writes the alarm data into the corresponding alarm record in the relational database, wherein the alarm record comprises a first byte and a second byte, the first byte is used for indicating whether the alarm event exists, and the second byte is used for indicating whether the alarm event has ended.
In some embodiments, the stream data processing platform obtains an alarm value corresponding to the alarm data and queries the relational database for an alarm record matching the alarm data. If the alarm value indicates that no alarm event occurs, and the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended, the stream data processing platform updates the alarm value in the alarm record. If the alarm value indicates that an alarm event occurs, and either the alarm record is queried and indicates that the alarm event corresponding to the alarm data has ended, or no alarm record is queried, the stream data processing platform writes the alarm value as a newly added record. If the alarm value indicates that an alarm event occurs, and the alarm record is queried and indicates that the alarm event corresponding to the alarm data has not ended, the stream data processing platform updates the alarm value in the alarm record.
In some embodiments, the target device is a device in an energy storage power station, the big data processing platform obtains operation data of an energy storage device of the energy storage power station from the distributed file storage system, predicts an operation condition of the energy storage device by using a machine learning model to generate a prediction result, writes the prediction result into the relational database, and/or outputs alarm information on the abnormality of the energy storage device when the prediction result indicates that the energy storage device is abnormal.
In some embodiments, the data processing system further comprises a web application end which is respectively in communication connection with the cache database, the relational database, the distributed file storage system and the big data processing platform.
The web application end acquires real-time data from the cache database, acquires query data from the relational database and/or the big data processing platform, and acquires the original telemetry data from the distributed file storage system.
In some embodiments, the distributed file storage system receives raw telemetry data of devices of the energy storage power station that are uploaded in bulk by the web application.
In some embodiments, the stream data processing platform sends a device provisioning program to the internet of things platform, the device provisioning program being configured to define, in the internet of things platform, an internet of things device for receiving raw telemetry data for a device of a corresponding energy storage power station.
In some embodiments, the stream data processing platform includes a plurality of computing nodes. After receiving the original telemetry data, the stream data processing platform first segments the original telemetry data and then distributes the segmented data to the plurality of computing nodes for parallel processing in a load-balanced manner, thereby achieving high concurrency and high throughput in stream data processing.
In some embodiments, the stream data processing platform applies several optimizations while processing the original telemetry data. First, it performs logical optimization of query operations by means of an internal optimizer, reducing computational overhead and resource consumption. Second, it manages state information in memory and keeps intermediate calculation results in memory, which avoids frequent disk I/O operations and brings stream data processing close to real time. In addition, it can periodically create checkpoints that save the processing state and intermediate results, so that processing can resume after a node fault and data consistency is preserved. Finally, it can dynamically adjust resource allocation according to the load to keep the system running efficiently.
The specific implementation manner of each step in the data processing method of the energy storage power station device provided by the embodiment of the application can refer to the specific implementation manner of each component in the data processing system of the energy storage power station device in the above embodiment, and the principle and the technical effect are similar, and are not repeated here.
The embodiment of the application also provides a data processing device. The data processing device may implement the technical solution executed by any of the devices (e.g., the stream data processing platform) in the data processing system described in any of the embodiments above.
Fig. 6 is a schematic structural diagram of a data processing device 60 according to an embodiment of the present application. As shown in fig. 6, the data processing device 60 may include a transceiver 601, a processor 602, and a memory 603.
Processor 602 executes computer-executable instructions stored in memory, causing processor 602 to perform the aspects of the embodiments described above. The processor 602 may be a general purpose processor including a central processing unit CPU, a network processor (network processor, NP), etc., or may be a digital signal processor DSP, an application specific integrated circuit ASIC, a field programmable gate array FPGA or other programmable logic device, discrete gate or transistor logic device, discrete hardware components.
The memory 603 and the processor 602 are connected and communicate with each other via a system bus, and the memory 603 is used to store computer program instructions.
The transceiver 601 may receive and transmit data/instructions.
Optionally, the data processing device 60 may also include a communication interface 604 to enable communication with external or internal devices, such as clients (e.g., cell phones, tablets), through the communication interface 604. In a specific implementation, if the communication interface 604, the memory 603 and the processor 602 are implemented independently, the communication interface 604, the memory 603 and the processor 602 may be connected to each other and communicate with each other through a bus.
The system bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The system bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is shown in the figure with only one bold line, but this does not mean that there is only one bus or one type of bus. The transceiver is used to enable communication between the data processing device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (RAM) and may also include non-volatile memory.
Alternatively, in a specific implementation, if the communication interface 604, the memory 603 and the processor 602 are integrated on a chip, the communication interface 604, the memory 603 and the processor 602 may complete communication through an internal interface.
The embodiment of the application also provides a chip for running the instructions, and the chip is used for executing the technical scheme executed by any one of the devices in the data processing system in the embodiment.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements a technical scheme executed by any device in the data processing system, and the implementation principle and technical effect are similar, and are not repeated herein.
In one possible implementation, the computer readable medium may include random access memory (RAM), read-only memory (ROM), compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The embodiment of the application also provides a computer program product, which comprises a computer program, when the computer program is executed by a processor, the computer program realizes the technical scheme executed by any device in the data processing system, and the implementation principle and the technical effect are similar, and are not repeated here.
In the specific implementation of the terminal device or the server, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the embodiments of the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
Those skilled in the art will appreciate that all or part of the steps of any of the method embodiments described above may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs all or part of the steps of the method embodiments described above.
The technical solution of the present application may be stored in a computer readable storage medium if implemented in the form of software and sold or used as a product. With such understanding, all or part of the technical solution of the present application may be embodied in the form of a software product stored in a storage medium comprising a computer program or several instructions. The computer software product causes a computer device (which may be a personal computer, a server, a network device, or similar electronic device) to perform all or part of the steps of the methods described in embodiments of the application.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are alternative embodiments, and that the acts and modules referred to are not necessarily required for the present application.
It should be further noted that, although the steps in the flowchart are sequentially shown as indicated by arrows, the steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least a portion of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order in which the sub-steps or stages are performed is not necessarily sequential, and may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps or other steps.
It will be appreciated that the device embodiments described above are merely illustrative and that the device of the application may be implemented in other ways. For example, the division of the units/modules in the above embodiments is merely a logic function division, and there may be another division manner in actual implementation. For example, multiple units, modules, or components may be combined, or may be integrated into another system, or some features may be omitted or not performed.
In addition, each functional unit/module in each embodiment of the present application may be integrated into one unit/module, or each unit/module may exist alone physically, or two or more units/modules may be integrated together, unless otherwise specified. The integrated units/modules described above may be implemented either in hardware or in software program modules.
The integrated units/modules, if implemented in hardware, may be digital circuits, analog circuits, etc. Physical implementations of hardware structures include, but are not limited to, transistors, memristors, and the like. Unless otherwise specified, the processor may be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, or ASIC. Unless otherwise indicated, the storage elements may be any suitable magnetic or magneto-optical storage medium, such as resistive random access memory (RRAM), dynamic random access memory (DRAM), static random access memory (SRAM), enhanced dynamic random access memory (EDRAM), high-bandwidth memory (HBM), hybrid memory cube (HMC), etc.
The integrated units/modules may be stored in a computer-readable memory if implemented in the form of software program modules and sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory. The software product comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The memory includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments. The technical features of the foregoing embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, all such combinations should be considered as being within the scope of the disclosure.
It should be noted that the above embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that the technical solution described in the above embodiments may be modified or some or all of the technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the scope of the technical solution of the embodiments of the present application.
Claims (12)
1. A data processing system, characterized by comprising a stream data processing platform, a big data processing platform, a cache database, a relational database and a distributed file storage system, wherein the stream data processing platform is communicatively connected to the cache database, the relational database, the distributed file storage system and the big data processing platform respectively, and the big data processing platform is also communicatively connected to the distributed file storage system;
the stream data processing platform is used for receiving original telemetry data reported by target equipment, extracting real-time data and query data from the original telemetry data, storing the real-time data into the cache database, storing the query data into the relational database and storing the original telemetry data into the distributed file storage system, wherein the real-time data is dynamically changed data in the running process of the target equipment;
The big data processing platform is used for responding to the data processing instruction of the stream data processing platform, carrying out statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain the statistical data of the target equipment, and storing the statistical data into the relational database;
The original telemetry data comprises alarm data, the stream data processing platform is also used for splitting the alarm data according to bits to obtain a plurality of alarm data, one alarm data corresponds to one alarm event, the alarm data is updated into an alarm record corresponding to the relational database, the alarm record comprises a first byte and a second byte, the first byte is used for indicating whether the alarm event exists or not, the second byte is used for indicating whether the alarm event ends or not, 1 bit of the alarm data corresponds to one type of alarm event, the value of the first byte is 1 or null, and the value of the second byte is 0 or null;
the stream data processing platform is specifically configured to:
acquiring an alarm value corresponding to the alarm data, and inquiring an alarm record matched with the alarm data from the relational database;
when the alarm value indicates that no alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record;
when the alarm value indicates that an alarm event has occurred, if the alarm record is found and indicates that the alarm event corresponding to the alarm data has ended, or if no alarm record is found, updating the alarm value by adding a new alarm record;
and when the alarm value indicates that an alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, updating the alarm value in the alarm record.
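The alarm handling recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory `AlarmStore` stands in for the relational database, the names `AlarmRecord`, `split_alarm_bits`, and `process_alarm_word` are hypothetical, and it assumes that a null second field means the alarm is still open while 0 means it has ended. The claim does not state what happens when the alarm value is 0 and no open record exists; the sketch assumes nothing is written in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlarmRecord:
    device_id: str
    alarm_type: int            # bit position inside the alarm word (assumed mapping)
    occurred: Optional[int]    # "first byte" of the claim: 1 or None
    ended: Optional[int]       # "second byte" of the claim: 0 or None (assumed: 0 = ended)
    updates: int = 0           # counts in-place updates, for illustration only


def split_alarm_bits(alarm_word: int, width: int) -> list[int]:
    """Split an alarm word into one 0/1 alarm value per bit, lowest bit first."""
    return [(alarm_word >> bit) & 1 for bit in range(width)]


class AlarmStore:
    """In-memory stand-in for the alarm table in the relational database."""

    def __init__(self) -> None:
        self.records: list[AlarmRecord] = []

    def find_latest(self, device_id: str, alarm_type: int) -> Optional[AlarmRecord]:
        for rec in reversed(self.records):
            if rec.device_id == device_id and rec.alarm_type == alarm_type:
                return rec
        return None

    def apply(self, device_id: str, alarm_type: int, value: int) -> None:
        rec = self.find_latest(device_id, alarm_type)
        is_open = rec is not None and rec.ended is None
        if value == 0:
            if is_open:
                rec.ended = 0          # no alarm now, open record: mark it ended
            return                     # otherwise nothing to write (assumption)
        if is_open:
            rec.occurred = 1           # alarm still present: update in place
            rec.updates += 1
        else:
            # alarm present and no open record: add a new record
            self.records.append(AlarmRecord(device_id, alarm_type, 1, None))


def process_alarm_word(store: AlarmStore, device_id: str, alarm_word: int, width: int = 8) -> None:
    for alarm_type, value in enumerate(split_alarm_bits(alarm_word, width)):
        store.apply(device_id, alarm_type, value)
```

Feeding the words `0b0101`, `0b0100`, and `0b0001` for one device in sequence opens two alarms, closes the first while refreshing the second, then reopens the first as a new record while closing the second.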
2. The system according to claim 1, wherein the stream data processing platform is communicatively connected to the target device through an internet of things platform, and is specifically configured to:
Receiving a JSON message sent by the Internet of things platform;
according to a preset table structure, analyzing the original telemetry data from the JSON message;
extracting initial real-time data and initial query data from the original telemetry data, wherein the initial query data comprises at least one of fault alarm data, system data and version change data;
performing de-duplication processing on the initial real-time data to obtain real-time data, and storing the real-time data into the cache database;
Filtering and splitting the initial query data to obtain the query data, and storing the query data into the relational database in a key value pair mode;
Storing the original telemetry data in a columnar file storage format into the distributed file storage system.
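As a rough illustration of the parsing and extraction steps of claim 2, the following Python sketch parses a JSON message against a preset table structure, separates real-time fields from query fields, and turns the query fields into key-value rows. The message layout, the field names in `TABLE_STRUCTURE`, and the function names are all assumptions made for the example; the claim does not specify them. De-duplication and columnar file storage are not shown.

```python
import json

# Hypothetical "preset table structure": which telemetry fields go where.
TABLE_STRUCTURE = {
    "realtime": ["voltage", "current", "soc"],
    "query": ["fault_alarm", "system", "version"],
}


def parse_message(message: str) -> dict:
    """Parse the JSON message into a flat record of raw telemetry (assumed layout)."""
    payload = json.loads(message)
    return {"device_id": payload["device_id"], "ts": payload["ts"], **payload["telemetry"]}


def extract(raw: dict) -> tuple[dict, dict]:
    """Split raw telemetry into initial real-time data and initial query data."""
    realtime = {k: raw[k] for k in TABLE_STRUCTURE["realtime"] if k in raw}
    query = {k: raw[k] for k in TABLE_STRUCTURE["query"] if k in raw}
    return realtime, query


def to_key_value_rows(device_id: str, ts: int, query: dict) -> list[tuple]:
    """Filter out empty values and split query data into one key-value row per field."""
    return [(device_id, ts, key, str(value)) for key, value in query.items() if value is not None]
```

Each resulting row has the shape `(device_id, ts, key, value)`, which is one plausible way to store key-value pairs in a relational table.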
3. The system according to claim 2, characterized in that the stream data processing platform is specifically configured to:
Extracting initial real-time data in a time window from the original telemetry data;
partitioning the initial real-time data in the time window according to the equipment type;
Taking the latest data in each partition as the real-time data;
and storing the real-time data into a partition corresponding to the equipment identifier in the cache database in a mode of covering old data of the equipment identifier according to the equipment identifier corresponding to the real-time data.
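The windowed de-duplication of claim 3 can be sketched in Python as follows. A plain dictionary stands in for the cache database, the record fields (`device_id`, `device_type`, `ts`) are hypothetical, and the partition key follows the claim wording (device type) by default while remaining replaceable through the `key` parameter.

```python
from typing import Callable, Iterable


def latest_per_partition(
    records: Iterable[dict],
    window_start: int,
    window_end: int,
    key: Callable[[dict], str] = lambda r: r["device_type"],
) -> list[dict]:
    """Keep only the newest record of each partition inside [window_start, window_end)."""
    latest: dict[str, dict] = {}
    for rec in records:
        if not (window_start <= rec["ts"] < window_end):
            continue  # outside the time window
        part = key(rec)
        if part not in latest or rec["ts"] >= latest[part]["ts"]:
            latest[part] = rec
    return list(latest.values())


def write_to_cache(cache: dict, realtime_records: Iterable[dict]) -> None:
    """Overwrite the cached entry of each device with its newest real-time record."""
    for rec in realtime_records:
        cache[rec["device_id"]] = rec
```

Because the cache entry is overwritten by device identifier, a reader of the cache always sees at most one (the most recent) record per device.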
4. A system according to any one of claims 1-3, wherein the target device is a device in an energy storage power station, the big data processing platform being further configured to:
acquiring operation data of an energy storage device of the energy storage power station from the distributed file storage system;
Predicting the running condition of the energy storage device by using a machine learning model to generate a prediction result;
Writing the prediction result into the relational database;
And/or outputting alarm information of the abnormality of the energy storage device when the prediction result indicates that the abnormality exists in the energy storage device.
5. The system of any of claims 1-3, wherein the data processing system further comprises a web application communicatively coupled to the cache database, the relational database, the distributed file storage system, and the big data processing platform, respectively;
the web application end is used for:
Acquiring real-time data from the cache database;
Acquiring query data from the relational database and/or the big data processing platform;
the raw telemetry data is obtained from the distributed file storage system.
6. The system of claim 5, wherein
The distributed file storage system is also used for receiving the original telemetry data of the target equipment, which are uploaded in batches by the web application end.
7. A system according to claim 2 or 3, wherein the stream data processing platform is further adapted to:
defining, by means of a device pre-configuration program, the Internet of things device in the Internet of things platform that is used for receiving the original telemetry data.
8. A system according to any of claims 1-3, wherein the stream data processing platform further comprises a plurality of data processing nodes, the stream data processing platform further being adapted to:
Distributing the original telemetry data to the plurality of data processing nodes in a load balancing mode so that the plurality of data processing nodes extract the real-time data and the query data.
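Claim 8 does not name a load balancing strategy. Purely as an illustration, the Python sketch below uses round-robin distribution, which is an assumption of this example; the class name `RoundRobinDispatcher` and the per-node queues are hypothetical stand-ins for the data processing nodes.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Distributes incoming telemetry messages evenly across processing nodes."""

    def __init__(self, node_names: list[str]) -> None:
        if not node_names:
            raise ValueError("at least one data processing node is required")
        # One queue per node; a real node would consume from its queue.
        self.queues: dict[str, list] = {name: [] for name in node_names}
        self._next_node = cycle(node_names)

    def dispatch(self, message: dict) -> str:
        """Hand the message to the next node in turn and return that node's name."""
        node = next(self._next_node)
        self.queues[node].append(message)
        return node
```

Other strategies, such as hashing on the device identifier so that one device always reaches the same node, would fit the claim wording equally well.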
9. A data processing method, characterized by being applied to a data processing system, wherein the data processing system comprises a stream data processing platform, a big data processing platform, a cache database, a relational database and a distributed file storage system, wherein the stream data processing platform is communicatively connected to the cache database, the relational database, the distributed file storage system and the big data processing platform respectively, and the big data processing platform is also communicatively connected to the distributed file storage system;
The stream data processing platform receives original telemetry data reported by target equipment, extracts real-time data and query data from the original telemetry data, stores the real-time data into the cache database, stores the query data into the relational database and stores the original telemetry data into the distributed file storage system, wherein the real-time data is dynamically changed data in the running process of the target equipment;
The big data processing platform responds to the data processing instruction of the stream data processing platform, performs statistical analysis processing on the original telemetry data stored in the distributed file storage system to obtain statistical data of the target equipment, and stores the statistical data into the relational database;
the original telemetry data comprises alarm data, the stream data processing platform further carries out splitting processing on the alarm data according to bits to obtain a plurality of alarm data, one alarm data corresponds to one alarm event, the alarm data is updated into an alarm record corresponding to the relational database, the alarm record comprises a first byte and a second byte, the first byte is used for indicating whether the alarm event exists or not, the second byte is used for indicating whether the alarm event ends or not, 1 bit of the alarm data corresponds to one type of alarm event, the value of the first byte is 1 or null, and the value of the second byte is 0 or null;
The stream data processing platform acquires an alarm value corresponding to the alarm data, and queries an alarm record matched with the alarm data from the relational database;
When the alarm value indicates that no alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, the alarm value is updated in the alarm record; when the alarm value indicates that an alarm event has occurred, if the alarm record is found and indicates that the alarm event corresponding to the alarm data has ended, or if no alarm record is found, the alarm value is updated by adding a new alarm record; and when the alarm value indicates that an alarm event has occurred, if the alarm record is found and the alarm record indicates that the alarm event corresponding to the alarm data has not ended, the alarm value is updated in the alarm record.
10. A data processing apparatus, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method as claimed in claim 9.
11. A computer readable storage medium, having stored thereon a computer program, the computer program being executed by a processor to implement the method of claim 9.
12. A computer program product comprising a computer program which, when executed by a controller, implements the method as claimed in claim 9.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411097229.1A CN118626518B (en) | 2024-08-12 | 2024-08-12 | Data processing system, method, device, storage medium and program product |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411097229.1A CN118626518B (en) | 2024-08-12 | 2024-08-12 | Data processing system, method, device, storage medium and program product |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118626518A (en) | 2024-09-10 |
| CN118626518B (en) | 2025-06-17 |
Family
ID=92600237
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411097229.1A Active CN118626518B (en) | 2024-08-12 | 2024-08-12 | Data processing system, method, device, storage medium and program product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118626518B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107590749A (en) * | 2017-09-07 | 2018-01-16 | 北京国电通网络技术有限公司 | A method and system for processing power distribution data |
| CN111432295A (en) * | 2020-03-18 | 2020-07-17 | 北京科东电力控制系统有限责任公司 | A master station system for collecting electricity information based on distributed technology |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TR201809901T4 (en) * | 2013-05-22 | 2018-07-23 | Striim Inc | Apparatus and method for sequential event processing in a dispersed environment. |
| EP3079116A1 (en) * | 2015-04-10 | 2016-10-12 | Tata Consultancy Services Limited | System and method for generating recommendations |
| CN106375113B (en) * | 2016-08-25 | 2020-01-17 | 新华三技术有限公司 | Method, device and system for recording faults of distributed equipment |
| CN116010703B (en) * | 2023-01-14 | 2024-02-20 | 北京天译科技有限公司 | Historical meteorological data query analysis system and method |
| CN117931502A (en) * | 2024-02-02 | 2024-04-26 | 江苏天合清特电气有限公司 | Alarm management method for optical storage integrated machine, optical storage integrated machine and computer storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118626518A (en) | 2024-09-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6280872B2 (en) | Distributed database with modular blocks and associated log files | |
| US9426219B1 (en) | Efficient multi-part upload for a data warehouse | |
| Lai et al. | Towards a framework for large-scale multimedia data storage and processing on Hadoop platform | |
| CN112988741A (en) | Real-time service data merging method and device and electronic equipment | |
| CN111258978A (en) | a method of data storage | |
| CN105117171A (en) | Energy SCADA massive data distributed processing system and method thereof | |
| CN118760724B (en) | Data management method, system, device and medium for data storage warehouse | |
| Mostafa et al. | SciTS: a benchmark for time-series databases in scientific experiments and industrial internet of things | |
| CN103617276A (en) | Method for storing distributed hierarchical RDF data | |
| CN106055678A (en) | Hadoop-based panoramic big data distributed storage method | |
| JP2021525907A (en) | Frequent pattern analysis of distributed systems | |
| CN114387124A (en) | Time sequence data storage method of nuclear power industry internet platform | |
| CN104281980B (en) | Thermal power generation unit remote diagnosis method and system based on Distributed Calculation | |
| CN118626518B (en) | Data processing system, method, device, storage medium and program product | |
| CN113326335A (en) | Data storage system, method, device, electronic equipment and computer storage medium | |
| CN110543496B (en) | Data processing method and device for time sequence database cluster | |
| CN118779337A (en) | An EFK log collection and analysis method for data aggregation | |
| CN111597201A (en) | A Fast Content Compression Method Based on Greenplum Large Scale Parallel Processing Database | |
| CN117216057A (en) | Method, device, medium and equipment for scrolling update of ES index | |
| CN116126238A (en) | Data storage method, system, device and nonvolatile storage medium | |
| Liu et al. | A versatile event-driven data model in hbase database for multi-source data of power grid | |
| CN112286995B (en) | Data analysis method, device, server, system and storage medium | |
| CN113010399A (en) | Log data processing method, system, device and medium | |
| Yan et al. | Big Data Storage and Analysis System for Space Application | |
| CN118673086B (en) | Data processing method, device, electronic device and computer readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |