CN114372027B - Data processing method, device, processor and electronic device - Google Patents
- Publication number
- CN114372027B
- Authority
- CN
- China
- Prior art keywords
- file
- loaded
- fields
- field
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/119—Details of migration of file systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses a data processing method, a data processing device, a processor and an electronic device. The method relates to the field of cloud computing and comprises: obtaining a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded; cutting the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields; processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields; and storing the processed file fields in a corresponding database. The application solves the problems of poor stability and low efficiency of distributed systems in the related art when loading large batches of data.
Description
Technical Field
The application relates to the field of cloud computing, in particular to a data processing method, a data processing device, a processor and electronic equipment.
Background
With the rapid development of the internet industry, the demand for sharing and transmitting data among enterprises, products and applications keeps growing. In an era of explosive information growth, loading a large number of data files must not compromise the stability of the database, and the performance of external online services must not be affected by data loading, so that the external service experience remains smooth. Meanwhile, on open platform systems, many database systems compete and are heterogeneous to some degree, so a given data file loading scheme is not compatible with all database systems, and replacing or interfacing with a database system can entail excessive migration work.
No effective solution has yet been proposed for the problems of poor stability and low efficiency of distributed systems in the related art when loading large batches of data.
Disclosure of Invention
The application mainly aims to provide a data processing method, a device, a processor and electronic equipment, so as to solve the problems of poor stability and low efficiency of a distributed system in the related technology when a large amount of data is loaded.
In order to achieve the above object, according to one aspect of the present application, there is provided a data processing method, including: obtaining a file to be loaded and a configuration file, where the configuration file includes a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded; cutting the file to be loaded according to the configuration file, and determining a plurality of file fields corresponding to the plurality of table fields; processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields; and storing the processed file fields in a corresponding database.
Optionally, before clipping the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to the plurality of table fields, the method further comprises the steps of obtaining a file list of the file to be loaded, analyzing the configuration file, determining the file name of the file to be loaded, determining a database table name, wherein the plurality of table fields in the database table correspond to the file positions of the file to be loaded respectively, traversing the file list, and checking the plurality of files to be loaded.
Optionally, traversing the file list, checking a plurality of files to be loaded includes traversing the file list, reading the length of a file row of the files to be loaded, determining the length of a file row corresponding to a table field of the database table according to the configuration file, executing the steps of cutting the files to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields when the length of the read file row is consistent with the length of the file row corresponding to the database table, skipping the file row of the files to be loaded when the length of the read file row is inconsistent with the length of the file row corresponding to the database table, checking the subsequent file row, and logging the skipped file row.
Optionally, cutting the file to be loaded according to the configuration file and determining the plurality of file fields corresponding to the plurality of table fields includes: cutting the content data at the file position of the file to be loaded according to the file position corresponding to each table field; when the cut content data is not empty, taking the cut content data as the file field, establishing a correspondence with the table field, and storing it; and when the cut content data is empty, taking preset default content data as the file field, establishing a correspondence with the table field, and storing it.
Optionally, processing the file field corresponding to the table field according to the processing function corresponding to the table field includes determining the processing function corresponding to the table field in the corresponding relation according to the corresponding relation, where the processing function corresponding to the table field is preset in the configuration file, and processing the file field corresponding to the corresponding relation through the processing function.
Optionally, storing the processed file field in a corresponding database includes processing the file field locally through the built-in function if the processing function is a built-in function, and transmitting the file field to the database if the processing function is a non-built-in function, wherein the database processes the file field through the non-built-in function before loading the file field.
Optionally, storing the processed file field in a corresponding database includes determining a loading script corresponding to the database according to the processed file field, and storing the file field in the database by running the loading script.
Optionally, there are a plurality of databases, and before the file fields corresponding to the table fields are processed according to the processing functions corresponding to the table fields, the method further includes: determining an identification field in the file to be loaded, where the identification field identifies the database corresponding to the file fields in the file to be loaded; and determining, according to the identification field and the file fields it identifies, the identifier of the database in which those file fields are to be stored.
Optionally, before storing the processed file fields in the corresponding databases, the method further comprises recording the number of file rows of the files to be loaded after processing, and storing the processed file fields in the files to be loaded in the corresponding databases when the number of file rows reaches a preset number.
In order to achieve the above object, according to another aspect of the present application, there is provided a data processing apparatus, including an obtaining module configured to obtain a file to be loaded, and a configuration file, where the configuration file includes a file name of the file to be loaded, and table names of a database table, a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded, a clipping module configured to clip the file to be loaded according to the configuration file, determine a plurality of file fields corresponding to the plurality of table fields, a processing module configured to process the file fields corresponding to the table fields according to a processing function corresponding to the table fields, and a storage module configured to store the processed file fields in a corresponding database.
In order to achieve the above object, according to another aspect of the present application, there is provided a processor for executing a program, wherein the program executes the data processing method of any one of the above.
In order to achieve the above object, according to another aspect of the present application, there is provided an electronic device including one or more processors and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any one of the data processing methods described above.
In the present application, a file to be loaded and a configuration file are obtained, where the configuration file includes a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded; the file to be loaded is cut according to the configuration file to determine a plurality of file fields corresponding to the table fields; the file fields are processed according to the processing functions corresponding to the table fields; and the processed file fields are stored in a corresponding database. Because the relation between the file to be loaded and the table fields of the database table is configured through the configuration file, and the file is loaded into the database only after being processed by the processing functions corresponding to the table fields, the problems of poor stability and low efficiency of distributed systems in the related art when loading large batches of data are solved. A large number of data files can thus be loaded stably and quickly with little fluctuation in database performance, which improves the stability and loading efficiency of the distributed system when loading large batches of data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a flow chart of a data processing method provided according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a batch file loading system provided in accordance with an embodiment of the present application;
FIG. 3 is a flow chart of batch file loading provided in accordance with an embodiment of the present application;
FIG. 4 is a schematic diagram of a data processing apparatus provided in accordance with an embodiment of the present application;
FIG. 5 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the application herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The present application is described below with reference to preferred implementation steps. FIG. 1 is a flowchart of a data processing method according to an embodiment of the present application. As shown in FIG. 1, the method includes the following steps:
Step S101, obtaining a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded;
Step S102, cutting the file to be loaded according to the configuration file, and determining a plurality of file fields corresponding to the plurality of table fields;
Step S103, processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields;
Step S104, storing the processed file fields in a corresponding database.
The steps above may be executed by a processor or a controller. Through these steps, a file to be loaded and a configuration file are obtained, where the configuration file includes a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded; the file to be loaded is cut according to the configuration file to determine a plurality of file fields corresponding to the table fields; the file fields are processed according to the processing functions corresponding to the table fields; and the processed file fields are stored in a corresponding database. Because the relation between the file to be loaded and the table fields of the database table is configured through the configuration file, and the file is loaded into the database only after being processed by the processing functions corresponding to the table fields, the problems of poor stability and low efficiency of distributed systems in the related art when loading large batches of data are solved. A large number of data files can thus be loaded stably and quickly with little fluctuation in database performance, which improves the stability and loading efficiency of the distributed system when loading large batches of data.
The file to be loaded may be a file used for data migration between different database systems in a distributed system. In the related art, database systems come in different versions, and database systems from different developers have different structures, so data migration or system replacement between different database systems involves migrating files with large data volumes. The large data volume often makes the database unstable during loading and degrades its performance. This embodiment provides a loading approach in which files are loaded by means of the configuration file and the database table: the processing of the file to be loaded is matched directly through the database table before loading. This avoids the prior-art problem that processing a file after it has been loaded into the database occupies database resources and affects database performance, and it improves the stability and loading efficiency of the files to be loaded to a certain extent.
The configuration file includes the file name of the file to be loaded and the table name of a database table, where a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded. The configuration file may also include a default value for each table field: when the file content at the file position corresponding to a table field is empty, the default value is used instead, which avoids a table field having no corresponding content and thereby interfering with the other correspondences. The configuration file may further include a processing mode for each table field, so that the file content corresponding to the table field is processed according to the preset processing mode.
The database table includes a plurality of table fields, each corresponding to a file position of the file to be loaded. It should be noted that the file to be loaded includes a plurality of rows and stores data in units of rows. Each table field therefore corresponds only to the file content from a first position to a second position within one row; a single table field never corresponds to positions spanning multiple rows.
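The row-wise cutting described above can be illustrated with a minimal Java sketch. The class names `FieldPosition` and `RowCutter`, the 0-based half-open character ranges, and the sample layout are assumptions made for illustration only; the application does not specify how positions are encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical mapping of one table field to a [start, end) character range within a row.
final class FieldPosition {
    final String tableField;
    final int start; // inclusive, 0-based (assumed convention)
    final int end;   // exclusive

    FieldPosition(String tableField, int start, int end) {
        this.tableField = tableField;
        this.start = start;
        this.end = end;
    }
}

final class RowCutter {
    // Cuts one row into file fields, keyed by table field name in configuration order.
    static Map<String, String> cut(String row, FieldPosition[] positions) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (FieldPosition p : positions) {
            fields.put(p.tableField, row.substring(p.start, p.end));
        }
        return fields;
    }
}
```

Each table field maps to exactly one range in one row, so the cutter never needs to look at neighbouring rows.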
The file to be loaded is cut according to the configuration file, a plurality of file fields corresponding to a plurality of table fields are determined, and then the file fields corresponding to the table fields can be stored in the position of a database table, or the storage position of the file fields is stored through the table fields, so that the file fields can be processed and loaded later.
The file fields corresponding to the table fields are processed according to the processing functions corresponding to the table fields, and the processing functions can be various functions for processing the fields, such as a sum function, a rounding function and the like. The processing functions may be divided into built-in functions and non-built-in functions of the database table, and the non-built-in functions are processing functions provided for services other than the database table. The processing functions corresponding to the file fields corresponding to the plurality of table fields may be identical.
After processing by the processing function, the processed file fields are stored in the corresponding database. The file fields include a field that identifies the database, so the database corresponding to the file fields can be determined from that field. Because the relation between the file to be loaded and the table fields of the database table is configured through the configuration file, and the file is loaded into the database only after being processed by the processing functions corresponding to the table fields, the problems of poor stability and low efficiency of distributed systems in the related art when loading large batches of data are solved. A large number of data files can thus be loaded stably and quickly with little fluctuation in database performance, which improves the stability and loading efficiency of the distributed system when loading large batches of data.
Optionally, before cutting the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields, the method further comprises the steps of obtaining a file list of the file to be loaded, analyzing the configuration file, determining the file name of the file to be loaded and the table name of a database, wherein the table fields in the database table correspond to the file positions of the file to be loaded respectively, traversing the file list, and checking the file to be loaded.
The file list may be determined from the loading requirement and includes the file names of a plurality of files to be loaded. Parsing the configuration file yields the file name of the file to be loaded to which the configuration file applies, and the name of the database table that this file uses during loading, where a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded.
Optionally, traversing the file list and checking the files to be loaded includes traversing the file list and reading the length of the file row of the files to be loaded, determining the length of the file row corresponding to the table field of the database table according to the configuration file, executing the steps of cutting the files to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields when the length of the read file row is consistent with the length of the file row corresponding to the database table, skipping the file row of the files to be loaded when the length of the read file row is inconsistent with the length of the file row corresponding to the database table, checking the subsequent file row, and logging the skipped file row.
Because the table fields of the database table correspond to file positions within one row of the file to be loaded, the row length expected by the database table can be determined from the table fields corresponding to the same row before the file fields are extracted, and compared with the length of the row actually read from the file to be loaded. If the lengths are consistent, the row is considered valid and can be cut. If the lengths are inconsistent, the file fields of the file to be loaded may contain errors, and the row can be logged or reported as an error. This ensures that the file to be loaded is read accurately and further improves loading accuracy.
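A minimal Java sketch of this row-length check follows. The class name `RowLengthChecker`, the in-memory log list, and the choice to measure length in characters rather than bytes are illustrative assumptions, not details given in the application.

```java
import java.util.ArrayList;
import java.util.List;

final class RowLengthChecker {
    // Log entries for skipped rows; a real system would presumably use a logging framework.
    final List<String> skippedLog = new ArrayList<>();
    private final int expectedLength;

    RowLengthChecker(int expectedLength) {
        this.expectedLength = expectedLength;
    }

    // Returns only the rows whose length matches the configured length;
    // mismatching rows are skipped and recorded, and checking continues with the next row.
    List<String> filter(List<String> rows) {
        List<String> valid = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String row = rows.get(i);
            if (row.length() == expectedLength) {
                valid.add(row);
            } else {
                skippedLog.add("row " + (i + 1) + " skipped: length " + row.length()
                        + ", expected " + expectedLength);
            }
        }
        return valid;
    }
}
```

Skipping rather than aborting means that one malformed row does not block the rest of the file, which matches the behaviour described above.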
Optionally, cutting the file to be loaded according to the configuration file and determining the plurality of file fields corresponding to the plurality of table fields includes: cutting the content data at the file position of the file to be loaded according to the file position corresponding to each table field; when the cut content data is not empty, taking the cut content data as the file field, establishing a correspondence with the table field, and storing it; and when the cut content data is empty, taking preset default content data as the file field, establishing a correspondence with the table field, and storing it.
To preserve the stability and accuracy of the correspondence between the table fields and the file fields of the file to be loaded, any file position that has no file field is filled with the default value, and the correspondence with the table field is established. When the processed file fields are loaded into the database, the default fields can be deleted, so that the file to be loaded is loaded accurately and remains consistent before and after loading. This further improves the loading accuracy and stability of the files to be loaded.
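The default-value substitution can be sketched in a few lines of Java. The helper name `DefaultFiller` is hypothetical, and treating a whitespace-only slice as "empty" is an assumption that suits fixed-position rows; the application does not define emptiness precisely.

```java
final class DefaultFiller {
    // Returns the cut content, or the configured default when the cut content is empty.
    // Assumption: a slice consisting only of whitespace counts as empty.
    static String valueOrDefault(String cutContent, String defaultValue) {
        if (cutContent == null || cutContent.trim().isEmpty()) {
            return defaultValue;
        }
        return cutContent;
    }
}
```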
Optionally, processing the file field corresponding to the table field according to the processing function corresponding to the table field includes determining the processing function corresponding to the table field in the corresponding relation according to the corresponding relation, wherein the processing function corresponding to the table field is preset in the configuration file, and processing the file field corresponding to the corresponding relation through the processing function.
The processing function corresponding to the table field may be included in the configuration file, and the processing function corresponding to the table field of the database table may be determined by parsing the configuration file. The processing function corresponding to the table field can be preset, and can be modified according to the requirement in the using process.
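As an illustration of looking up a processing function by the name given in the configuration, the following Java sketch keeps a small registry of functions. The class name `FieldProcessors` and the `UPPER` function are hypothetical; `RTRIM` is used because the alternative embodiment names it as an example processing mode.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

final class FieldProcessors {
    // Hypothetical registry: function name from the configuration -> implementation.
    private static final Map<String, UnaryOperator<String>> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("RTRIM", s -> s.replaceAll("\\s+$", ""));       // strip trailing whitespace
        REGISTRY.put("UPPER", s -> s.toUpperCase(Locale.ROOT));      // illustrative extra function
    }

    // Applies the configured function to a file field; a null name means "no processing".
    static String apply(String functionName, String value) {
        if (functionName == null) {
            return value;
        }
        UnaryOperator<String> f = REGISTRY.get(functionName);
        if (f == null) {
            throw new IllegalArgumentException("unknown processing function: " + functionName);
        }
        return f.apply(value);
    }
}
```

Because the mapping lives in data rather than code, changing the function assigned to a table field only requires editing the configuration, in line with the statement that the function can be modified as required.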
Optionally, storing the processed file field in a corresponding database includes processing the file field locally through the built-in function if the processing function is a built-in function, and transmitting the file field to the database if the processing function is a non-built-in function, wherein the database processes the file field through the non-built-in function before loading the file field.
When the processing function is a built-in function, that is, a processing function of the database table itself, the file field corresponding to the table field can be processed in the database table, which avoids processing the file field while the file is being loaded and occupying additional system resources. For processing functions not included in the database table, the file fields need to be processed in the course of loading them into the database, and the processed file fields are then loaded into the corresponding database.
Optionally, storing the processed file fields in the corresponding databases includes determining a loading script corresponding to the databases according to the processed file fields, and storing the file fields in the databases by running the loading script.
The database may be a distributed database, for example a MySQL database, and the loading script may be the SQL script required to load that database. In one embodiment, the processed file fields are assembled into batches of SQL scripts whose format depends on the target database. For example, for a MySQL database, a script format may be used that inserts a record when it does not exist in the database and updates the original record when it does: INSERT INTO TABLENAME(A,B) VALUES('VALUEA','VALUEB'),('VALUEC','VALUED') ON DUPLICATE KEY UPDATE A=VALUES(A),B=VALUES(B).
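The assembly of such a batch script can be sketched in Java as follows. The class name `UpsertSqlBuilder` is hypothetical, and the sketch emits `?` placeholders for use with a prepared statement instead of inlining quoted values, which is a substitution made here for safety rather than something the application describes. Table and column names are concatenated directly, so they are assumed to come from the trusted configuration file.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class UpsertSqlBuilder {
    // Builds a MySQL-style multi-row "insert or update" statement with ? placeholders.
    static String build(String table, List<String> columns, int rowCount) {
        String cols = String.join(",", columns);
        // One "(?,?,...)" group per row, one "?" per column.
        String rowGroup = "(" + String.join(",", Collections.nCopies(columns.size(), "?")) + ")";
        String values = String.join(",", Collections.nCopies(rowCount, rowGroup));
        String updates = columns.stream()
                .map(c -> c + "=VALUES(" + c + ")")
                .collect(Collectors.joining(","));
        return "INSERT INTO " + table + "(" + cols + ") VALUES " + values
                + " ON DUPLICATE KEY UPDATE " + updates;
    }
}
```

For two columns and two rows, `build("TABLENAME", List.of("A", "B"), 2)` yields a statement with the same shape as the example in the text, with each literal replaced by a placeholder.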
Optionally, there are a plurality of databases, and before the file fields corresponding to the table fields are processed according to the processing functions corresponding to the table fields, the method further includes: determining an identification field in the file to be loaded, where the identification field identifies the database corresponding to the file fields in the file to be loaded; and determining, according to the identification field and the file fields it identifies, the identifier of the database in which those file fields are to be stored.
The identification field can determine the database corresponding to the file field in the file to be loaded, so that a database identification is generated to identify the file field, and the database corresponding to the file field is determined, thereby facilitating subsequent loading.
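One way to picture this routing is to group rows by the value of their identification field, as in the Java sketch below. The class name `DatabaseRouter` and the assumption that the identification field occupies a fixed character range in every row are illustrative; the application does not state where the identification field is located.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DatabaseRouter {
    // Groups rows by the database identifier found at [idStart, idEnd) in each row.
    static Map<String, List<String>> groupByDatabase(List<String> rows, int idStart, int idEnd) {
        Map<String, List<String>> byDatabase = new LinkedHashMap<>();
        for (String row : rows) {
            String databaseId = row.substring(idStart, idEnd);
            byDatabase.computeIfAbsent(databaseId, k -> new ArrayList<>()).add(row);
        }
        return byDatabase;
    }
}
```

Each group can then be processed and loaded into its own database independently of the others.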
Optionally, before the processed file fields are stored in the corresponding databases, the method further comprises recording the number of file lines of the files to be loaded after processing, and storing the processed file fields in the files to be loaded in the corresponding databases when the number of file lines reaches the preset number.
To improve the efficiency of loading files in batches, the file rows matched to the database table are counted as the file to be loaded is mapped to the table. Because the loading performance of the system has an upper limit, attempting to load too many rows at once can cause the load to stall or fail. Therefore, once the number of file rows reaches the preset number, the file fields corresponding to the table fields of the database table are processed and loaded, which keeps loading efficient and improves the loading accuracy and stability of the files to be loaded.
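The row-count threshold can be sketched as a small buffer that hands rows to a loader once the preset number is reached. The class name `BatchBuffer`, the `Consumer` callback standing in for the actual database load, and the explicit final `flush()` for leftover rows are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchBuffer {
    private final int batchSize;                 // the preset number of rows per load
    private final Consumer<List<String>> loader; // stand-in for submitting a batch to the database
    private final List<String> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<String>> loader) {
        this.batchSize = batchSize;
        this.loader = loader;
    }

    // Records one processed row and loads the batch when the preset number is reached.
    void add(String processedRow) {
        pending.add(processedRow);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Loads whatever is pending; assumed to be called once more at end of file.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        loader.accept(new ArrayList<>(pending)); // pass a copy so the loader may keep it
        pending.clear();
    }
}
```

Bounding each submission to a fixed number of rows keeps any single load well within the system's capacity.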
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
This example also provides an alternative embodiment, which is described in detail below.
The present embodiment addresses the shortcomings of the prior art by providing a device for loading large batches of data files that is stable, has little impact on database performance, does not require an additional database client to be installed, and is compatible with heterogeneous databases.
FIG. 2 is a schematic diagram of a batch file loading system according to an embodiment of the present application. As shown in FIG. 2, the main idea of the present application is to parse and split the records of a large file according to the configured correspondence between the file and the table structure, reassemble them into small batches of loading scripts, and submit those scripts to the database for loading, thereby reducing the database transaction overhead of loading.
The device is developed in Java and packaged as a JAR for users. A user can import the JAR directly or as a Maven dependency, without installing an additional database client.
After the device is introduced, the corresponding relation and conversion relation of the configuration file and the data table field are mainly configured as follows:
1. The names of the files to be loaded and the database table name.
2. For each table field, its starting position in the file record row.
3. A default value for each table field; if the file carries no value for the field, the default value is loaded into the database.
4. Optionally, a data processing mode for each table field, such as removing trailing spaces with RTRIM or assembling several fields of the file into one.
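The four configuration items above might be held in memory as follows. This is a minimal sketch, not the configuration format of the application: the class names `FieldMapping` and `LoadConfig`, the 0-based offsets, and the explicit field `length` (the text only names a starting position) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** One table field and where its value sits in a fixed-width file record (hypothetical model). */
class FieldMapping {
    final String column;       // table field name
    final int start;           // 0-based start offset in the record row (assumed convention)
    final int length;          // field width in characters (assumed)
    final String defaultValue; // loaded when the file carries no value
    final String function;     // optional processing function name such as "RTRIM", or null

    FieldMapping(String column, int start, int length, String defaultValue, String function) {
        this.column = column;
        this.start = start;
        this.length = length;
        this.defaultValue = defaultValue;
        this.function = function;
    }
}

/** File-to-table configuration covering items 1 to 4 of the list above. */
class LoadConfig {
    final String fileName;
    final String tableName;
    final List<FieldMapping> fields;

    LoadConfig(String fileName, String tableName, List<FieldMapping> fields) {
        this.fileName = fileName;
        this.tableName = tableName;
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    /** Expected record length, taken here as the end of the right-most field. */
    int recordLength() {
        int end = 0;
        for (FieldMapping f : fields) {
            end = Math.max(end, f.start + f.length);
        }
        return end;
    }
}
```

Deriving the expected record length from the field layout gives the loader a single value to compare each record row against.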
Fig. 3 is a flowchart of batch file loading according to an embodiment of the present application. After the device is started, batch files are loaded according to the flow shown in Fig. 3:
1. Parse the configuration file and store its configuration information for use in subsequent steps.
2. Acquire the file list according to the names of the files to be loaded.
3. Read the records in each file in a loop and check, according to the configuration information, whether the length of each record row is consistent with the length defined in the configuration file.
4. If the lengths are inconsistent, log the record and skip it. If they are consistent, cut the record into the content of each field according to the configuration information and establish a correspondence with the fields in the configuration information.
5. Process the field content according to the default value and the processing function of each field in the configuration information.
6. If the processing function is a custom built-in function, complete the processing in the local Java program, which saves the processing resources of the database. If it is not a custom built-in function, the processing is performed by the database when the data is loaded.
7. Assemble the processed field values into batch SQL scripts whose format depends on the target database. For a MySQL database, for example, the following script format can be used, which inserts a record if it does not exist and updates the original record if it does: INSERT INTO TABLENAME(A,B) VALUES('VALUEA','VALUEB'),('VALUEC','VALUED') ON DUPLICATE KEY UPDATE A=VALUES(A),B=VALUES(B).
8. To keep transactions small, submit the records to the database for loading after every 500 records processed.
9. Repeat this loop until the records of all files to be loaded have been processed.
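Steps 3 to 5 of the flow above can be sketched for a single record row as follows. The sketch assumes fixed-width records, 0-based offsets, and that a field consisting only of whitespace counts as empty; the names `RecordCutter` and `Field` are hypothetical, and returning `null` for a mismatched row is just one way to let the caller log and skip it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Cuts one fixed-width record row into field values (steps 3 to 5, simplified). */
class RecordCutter {
    /** Hypothetical layout entry for one table field. */
    static final class Field {
        final String column;
        final int start;           // 0-based offset in the record row (assumed convention)
        final int width;           // field width in characters (assumed)
        final String defaultValue; // used when the cut content is blank
        final boolean rtrim;       // whether trailing whitespace is stripped locally

        Field(String column, int start, int width, String defaultValue, boolean rtrim) {
            this.column = column;
            this.start = start;
            this.width = width;
            this.defaultValue = defaultValue;
            this.rtrim = rtrim;
        }
    }

    private final int recordLength;
    private final List<Field> fields;

    RecordCutter(int recordLength, Field... fields) {
        this.recordLength = recordLength;
        this.fields = Arrays.asList(fields);
    }

    /** Returns the cut fields, or null when the row length does not match the configuration. */
    Map<String, String> cut(String line) {
        if (line.length() != recordLength) {
            return null; // the caller is expected to log the row and continue with the next one
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (Field f : fields) {
            String raw = line.substring(f.start, f.start + f.width);
            String value = raw.trim().isEmpty() ? f.defaultValue : raw;
            if (f.rtrim) {
                value = value.replaceAll("\\s+$", ""); // local equivalent of RTRIM
            }
            row.put(f.column, value);
        }
        return row;
    }
}
```

A `LinkedHashMap` keeps the fields in configuration order, which is convenient when the values are later written into a column list.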
Compared with the traditional database data loading scheme, the device has the following advantages:
First, in the traditional database loading mode the transaction size is controlled by the application side. This control is loose, so large transactions are easily initiated and affect the database. The device loads data in small transactions of 500 records, which has less impact on database performance during loading than large transactions and avoids disrupting online use of the database.
Second, the device supports custom built-in functions. The traditional database loading mode uses the computing resources of the database for this processing, whereas the device moves the computation to local cloud server resources, which are easy to expand. This greatly saves the shared resources of the database and helps keep the database stable.
Third, in the traditional database loading mode a database client may have to be installed locally before files can be loaded. This runs counter to the practice on distributed platforms of starting servers in the cloud directly from an image, which requires the image to be small. The device is introduced as a JAR, requires no additional client, and is more convenient to use.
Fourth, in the traditional database loading mode each database system uses its own loading scheme, so the schemes are heterogeneous. The present scheme assembles SQL scripts by parsing the file, so a single scheme can more easily interface with multiple heterogeneous databases, which reduces the cost of data migration and data loading between different databases.
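Because loading is expressed as assembled SQL, supporting another database amounts to providing another script builder. The sketch below covers only the MySQL script format quoted in step 7 of the flow; the class name `MySqlUpsertBuilder` and the `quote` helper are hypothetical. Its literal quoting is for illustration only, and production code would normally use bound parameters (for example JDBC `PreparedStatement` batches) instead.

```java
import java.util.List;
import java.util.StringJoiner;

/** Builds one multi-row MySQL upsert statement for a batch of rows. */
class MySqlUpsertBuilder {
    static String build(String table, List<String> columns, List<List<String>> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("no rows to load");
        }
        StringJoiner cols = new StringJoiner(",", "(", ")");
        StringJoiner updates = new StringJoiner(",");
        for (String c : columns) {
            cols.add(c);
            updates.add(c + "=VALUES(" + c + ")"); // overwrite the existing record on key conflict
        }
        StringJoiner values = new StringJoiner(",");
        for (List<String> row : rows) {
            StringJoiner tuple = new StringJoiner(",", "(", ")");
            for (String v : row) {
                tuple.add(quote(v));
            }
            values.add(tuple.toString());
        }
        return "INSERT INTO " + table + cols + " VALUES" + values
                + " ON DUPLICATE KEY UPDATE " + updates;
    }

    /** Minimal literal quoting for illustration only; not a complete escaping routine. */
    static String quote(String v) {
        return "'" + v.replace("'", "''") + "'";
    }
}
```

A builder for a different database would keep the same `build` signature and emit that system's own insert-or-update syntax.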
Fig. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present application. As shown in Fig. 4, according to another aspect of the present application, there is provided a data processing apparatus including an acquisition module 42, a clipping module 44, a processing module 46, and a storage module 48, which are described in detail below.
The acquisition module 42 is used for acquiring a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded. The clipping module 44 is connected with the acquisition module 42 and is used for clipping the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields. The processing module 46 is connected with the clipping module 44 and is used for processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields. The storage module 48 is connected with the processing module 46 and is used for storing the processed file fields into a corresponding database.
With this device, the acquisition module 42 acquires the file to be loaded and the configuration file, wherein the configuration file comprises the file name of the file to be loaded and the table name of the database table, and a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded. The clipping module 44 clips the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields. The processing module 46 processes the file fields corresponding to the table fields according to the processing functions corresponding to the table fields, and the storage module 48 stores the processed file fields in the corresponding database. Because the relation between the file to be loaded and the table fields of the database table is configured through the configuration file, and the file to be loaded is loaded into the database after being processed according to the processing functions corresponding to the table fields, the problems of poor stability and low efficiency of distributed systems in the related art when loading large amounts of data are solved. This achieves the effects of loading a large number of data files stably and quickly with little fluctuation in database performance, and of improving the stability and loading efficiency of the distributed system when loading large amounts of data.
The data processing device comprises a processor and a memory. The acquisition module 42, the clipping module 44, the processing module 46, the storage module 48 and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided, and the file to be loaded is processed and stored in the corresponding database by adjusting the kernel parameters.
The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the data processing method.
The embodiment of the invention provides a processor which is used for running a program, wherein the data processing method is executed when the program runs.
Fig. 5 is a schematic diagram of an electronic device according to an embodiment of the present application, and as shown in fig. 5, an electronic device 50 is provided, where the electronic device 50 includes a processor, a memory, and a program stored on the memory and executable on the processor, and the processor implements the following steps when executing the program:
acquiring a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded; cutting the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields; processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields; and storing the processed file fields in a corresponding database.
Optionally, before cutting the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields, the method further comprises the steps of obtaining a file list of the file to be loaded, analyzing the configuration file, determining the file name of the file to be loaded and the table name of a database, wherein the table fields in the database table correspond to the file positions of the file to be loaded respectively, traversing the file list, and checking the file to be loaded.
Optionally, traversing the file list and checking the files to be loaded includes traversing the file list and reading the length of the file row of the files to be loaded, determining the length of the file row corresponding to the table field of the database table according to the configuration file, executing the steps of cutting the files to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields when the length of the read file row is consistent with the length of the file row corresponding to the database table, skipping the file row of the files to be loaded when the length of the read file row is inconsistent with the length of the file row corresponding to the database table, checking the subsequent file row, and logging the skipped file row.
Optionally, where the configuration file comprises the file name of the file to be loaded and the database table name, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded, cutting the file to be loaded includes: cutting the content data at the file position in the file to be loaded according to the file position corresponding to the table field; when the cut content data is not empty, taking the cut content data as the file field, establishing a correspondence with the table field, and storing it; and when the cut content data is empty, taking preset default content data as the file field, establishing a correspondence with the table field, and storing it.
Optionally, processing the file field corresponding to the table field according to the processing function corresponding to the table field includes determining the processing function corresponding to the table field in the corresponding relation according to the corresponding relation, wherein the processing function corresponding to the table field is preset in the configuration file, and processing the file field corresponding to the corresponding relation through the processing function.
Optionally, storing the processed file field in a corresponding database includes processing the file field locally through the built-in function if the processing function is a built-in function, and transmitting the file field to the database if the processing function is a non-built-in function, wherein the database processes the file field through the non-built-in function before loading the file field.
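The split between built-in and non-built-in functions can be sketched as a lookup: a function found in a local registry is applied in the Java program and its result is sent as a plain literal, while any other function name is left in the SQL text for the database to evaluate at load time. The registry contents, the class name `FunctionDispatcher`, and the literal quoting are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Decides whether a processing function runs locally or is left for the database. */
class FunctionDispatcher {
    // Hypothetical registry of built-in functions implemented in the local Java program.
    private static final Map<String, UnaryOperator<String>> BUILT_IN = new HashMap<>();

    static {
        BUILT_IN.put("RTRIM", s -> s.replaceAll("\\s+$", ""));
        BUILT_IN.put("UPPER", s -> s.toUpperCase(Locale.ROOT));
    }

    /** Returns the SQL fragment to place in the VALUES list for one field. */
    static String toSqlValue(String function, String raw) {
        if (function == null) {
            return quote(raw); // no processing configured
        }
        UnaryOperator<String> local = BUILT_IN.get(function);
        if (local != null) {
            return quote(local.apply(raw)); // processed locally, database sees a literal
        }
        return function + "(" + quote(raw) + ")"; // evaluated by the database when loading
    }

    /** Minimal literal quoting for illustration only. */
    static String quote(String v) {
        return "'" + v.replace("'", "''") + "'";
    }
}
```

Under these assumptions, `toSqlValue("RTRIM", "abc  ")` yields a literal that is already trimmed, whereas a function name absent from the registry is passed through as a call wrapped around the quoted value.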
Optionally, storing the processed file fields in the corresponding databases includes determining a loading script corresponding to the databases according to the processed file fields, and storing the file fields in the databases by running the loading script.
Optionally, there are a plurality of databases, and before the file fields corresponding to the table fields are processed according to the processing functions corresponding to the table fields, the method further comprises: obtaining the identification field in the file to be loaded, wherein the identification field is used for identifying the database in which a file field in the file to be loaded is to be stored; and determining the database identification of the database in which the file field is to be stored according to the identification field and the file field identified by the identification field.
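One possible reading of the identification field is a fixed-position code in each record row that maps to a database identification. The sketch below groups rows by target database on that basis. The position of the identification field, the mapping table, and the class name `DatabaseRouter` are hypothetical; the application does not specify how the identification field is encoded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Resolves the target database of each record row from an identification field. */
class DatabaseRouter {
    private final int idStart;  // 0-based offset of the identification field (assumed)
    private final int idLength; // width of the identification field (assumed)
    private final Map<String, String> idToDatabase; // identification value -> database identification

    DatabaseRouter(int idStart, int idLength, Map<String, String> idToDatabase) {
        this.idStart = idStart;
        this.idLength = idLength;
        this.idToDatabase = idToDatabase;
    }

    /** Returns the database identification for one record row. */
    String databaseFor(String line) {
        String id = line.substring(idStart, idStart + idLength);
        String database = idToDatabase.get(id);
        if (database == null) {
            throw new IllegalArgumentException("unknown identification field: " + id);
        }
        return database;
    }

    /** Groups record rows by target database so each group can be loaded separately. */
    Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> byDatabase = new LinkedHashMap<>();
        for (String line : lines) {
            byDatabase.computeIfAbsent(databaseFor(line), k -> new ArrayList<>()).add(line);
        }
        return byDatabase;
    }
}
```

Grouping before processing means each database receives only its own loading scripts.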
Optionally, before the processed file fields are stored in the corresponding databases, the method further comprises recording the number of file lines of the files to be loaded after processing, and storing the processed file fields in the files to be loaded in the corresponding databases when the number of file lines reaches the preset number.
The device herein may be a server, PC, PAD, cell phone, etc.
The application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: acquiring a file to be loaded and a configuration file, wherein the configuration file comprises the file name of the file to be loaded and the table name of a database table, and a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded; cutting the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields; processing the file fields corresponding to the table fields according to the processing functions corresponding to the table fields; and storing the processed file fields into a corresponding database.
Optionally, before cutting the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields, the method further comprises the steps of obtaining a file list of the file to be loaded, analyzing the configuration file, determining the file name of the file to be loaded and the table name of a database, wherein the table fields in the database table correspond to the file positions of the file to be loaded respectively, traversing the file list, and checking the file to be loaded.
Optionally, traversing the file list and checking the files to be loaded includes traversing the file list and reading the length of the file row of the files to be loaded, determining the length of the file row corresponding to the table field of the database table according to the configuration file, executing the steps of cutting the files to be loaded according to the configuration file and determining a plurality of file fields corresponding to a plurality of table fields when the length of the read file row is consistent with the length of the file row corresponding to the database table, skipping the file row of the files to be loaded when the length of the read file row is inconsistent with the length of the file row corresponding to the database table, checking the subsequent file row, and logging the skipped file row.
Optionally, where the configuration file comprises the file name of the file to be loaded and the database table name, and a plurality of table fields in the database table respectively correspond to file positions of the file to be loaded, cutting the file to be loaded includes: cutting the content data at the file position in the file to be loaded according to the file position corresponding to the table field; when the cut content data is not empty, taking the cut content data as the file field, establishing a correspondence with the table field, and storing it; and when the cut content data is empty, taking preset default content data as the file field, establishing a correspondence with the table field, and storing it.
Optionally, processing the file field corresponding to the table field according to the processing function corresponding to the table field includes determining the processing function corresponding to the table field in the corresponding relation according to the corresponding relation, wherein the processing function corresponding to the table field is preset in the configuration file, and processing the file field corresponding to the corresponding relation through the processing function.
Optionally, storing the processed file field in a corresponding database includes processing the file field locally through the built-in function if the processing function is a built-in function, and transmitting the file field to the database if the processing function is a non-built-in function, wherein the database processes the file field through the non-built-in function before loading the file field.
Optionally, storing the processed file fields in the corresponding databases includes determining a loading script corresponding to the databases according to the processed file fields, and storing the file fields in the databases by running the loading script.
Optionally, there are a plurality of databases, and before the file fields corresponding to the table fields are processed according to the processing functions corresponding to the table fields, the method further comprises: obtaining the identification field in the file to be loaded, wherein the identification field is used for identifying the database in which a file field in the file to be loaded is to be stored; and determining the database identification of the database in which the file field is to be stored according to the identification field and the file field identified by the identification field.
Optionally, before the processed file fields are stored in the corresponding databases, the method further comprises recording the number of file lines of the files to be loaded after processing, and storing the processed file fields in the files to be loaded in the corresponding databases when the number of file lines reaches the preset number.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.
Claims (10)
1. A method of data processing, comprising:
acquiring a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded;
cutting the file to be loaded according to the configuration file, and determining a plurality of file fields corresponding to the plurality of table fields;
processing the file field corresponding to the table field according to the processing function corresponding to the table field;
storing the processed file fields into corresponding databases;
wherein, before cutting the file to be loaded according to the configuration file and determining the plurality of file fields corresponding to the plurality of table fields, the method further comprises:
acquiring a file list of the files to be loaded, wherein the file list comprises file names of a plurality of files to be loaded;
analyzing the configuration file, and determining the file name of the file to be loaded and the table name of the database table, wherein a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded;
traversing the file list, and checking the plurality of files to be loaded;
wherein the configuration file comprises the file name of the file to be loaded and the table name of the database table, a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded, and cutting the file to be loaded comprises:
cutting the content data in the file position in the file to be loaded according to the file position corresponding to the table field;
under the condition that the cut content data is not empty, taking the cut content data as the file field, establishing a corresponding relation with the table field, and storing;
and under the condition that the cut content data is empty, taking the preset default content data as the file field, establishing a corresponding relation with the table field, and storing.
2. The method of claim 1, wherein traversing the file list and checking the plurality of files to be loaded comprises:
traversing the file list, and reading the length of a file row of a file to be loaded;
determining the length of a file row corresponding to a table field of the database table according to the configuration file;
executing the step of cutting the file to be loaded according to the configuration file to determine a plurality of file fields corresponding to the plurality of table fields under the condition that the length of the read file row is consistent with the length of the file row corresponding to the database table;
and under the condition that the length of the read file row is inconsistent with the length of the file row corresponding to the database table, skipping the file row of the file to be loaded, checking the subsequent file row, and logging the skipped file row.
3. The method of claim 1, wherein processing the file field corresponding to the table field according to the processing function corresponding to the table field comprises:
determining a processing function corresponding to a table field in the corresponding relation according to the corresponding relation, wherein the processing function corresponding to the table field is preset in the configuration file;
and processing the file fields corresponding to the corresponding relation through the processing function.
4. The method of claim 3, wherein storing the processed file fields in the corresponding database comprises:
under the condition that the processing function is a built-in function, processing the file field locally through the built-in function;
and under the condition that the processing function is a non-built-in function, sending the file field to the database, wherein the database processes the file field through the non-built-in function before loading the file field.
5. The method of claim 1, wherein storing the processed file fields in the corresponding database comprises:
determining a loading script corresponding to the database according to the processed file field;
and storing the file field into the database by running the loading script.
6. The method of claim 5, wherein there are a plurality of databases, and wherein before processing the file field corresponding to the table field according to the processing function corresponding to the table field, the method further comprises:
obtaining the identification field in the file to be loaded, wherein the identification field is used for identifying the database in which a file field in the file to be loaded is to be stored;
and determining the database identification of the database in which the file field is to be stored according to the identification field and the file field identified by the identification field.
7. The method according to any one of claims 1 to 6, wherein before storing the processed file fields in the corresponding database, the method further comprises:
recording the number of file rows of the file to be loaded for which processing is completed;
and storing the processed file fields of the file to be loaded into a corresponding database under the condition that the number of file rows reaches the preset number.
8. A data processing apparatus, comprising:
an acquisition module, used for acquiring a file to be loaded and a configuration file, wherein the configuration file comprises a file name of the file to be loaded and a table name of a database table, and a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded;
a clipping module, used for clipping the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to the plurality of table fields;
a processing module, used for processing the file field corresponding to the table field according to the processing function corresponding to the table field;
a storage module, used for storing the processed file fields in a corresponding database;
wherein the device is further configured to, before clipping the file to be loaded according to the configuration file and determining a plurality of file fields corresponding to the plurality of table fields, acquire a file list of the files to be loaded, wherein the file list includes file names of the plurality of files to be loaded, and analyze the configuration file to determine the file name of the file to be loaded and the table name of the database table, wherein a plurality of table fields in the database table respectively correspond to the file positions of the file to be loaded;
and the device is further configured to clip the content data in the file position in the file to be loaded according to the file position corresponding to the table field; when the clipped content data is not empty, take the clipped content data as the file field, establish a corresponding relation with the table field, and store it; and when the clipped content data is empty, take preset default content data as the file field, establish a corresponding relation with the table field, and store it.
9. A processor for running a program, wherein the program when run performs the data processing method of any one of claims 1 to 7.
10. An electronic device comprising one or more processors and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method of any of claims 1-7.
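The clipping, default-value, processing, and row-count steps recited in claims 7 and 8 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `TableField`, `clip_line`, and `load`, the use of character offsets as the "file position", the treatment of whitespace-only content as empty, and the final flush of a partial batch are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class TableField:
    """Hypothetical entry for one table field in the configuration file."""
    name: str
    position: Tuple[int, int]        # assumed (start, end) character offsets in a line
    process: Callable[[str], str]    # processing function for this table field
    default: str                     # preset default content for empty clips


def clip_line(line: str, fields: List[TableField]) -> Dict[str, str]:
    """Clip one line by position, substitute defaults, then process each field."""
    row = {}
    for f in fields:
        start, end = f.position
        content = line[start:end]
        # Assumption: whitespace-only content counts as "empty".
        if content.strip() == "":
            content = f.default
        row[f.name] = f.process(content)
    return row


def load(lines: Iterable[str],
         fields: List[TableField],
         store: Callable[[List[Dict[str, str]]], None],
         batch_size: int) -> None:
    """Store processed rows once the processed-row count reaches batch_size."""
    pending: List[Dict[str, str]] = []
    for line in lines:
        pending.append(clip_line(line.rstrip("\n"), fields))
        if len(pending) >= batch_size:
            store(pending)
            pending = []
    # Assumption beyond the claim text: flush any remaining rows at end of file.
    if pending:
        store(pending)


# Example configuration and input, both invented for this sketch.
FIELDS = [
    TableField("id", (0, 4), str.strip, "0000"),
    TableField("name", (4, 10), lambda s: s.strip().upper(), "UNKNOWN"),
]
LINES = ["0001alice \n", "    bob   \n", "0003      \n"]

batches: List[List[Dict[str, str]]] = []
load(LINES, FIELDS, batches.append, batch_size=2)
```

Here `store` stands in for the database write; passing `batches.append` simply collects each batch so the grouping can be inspected. Batching by row count trades a small delay for fewer database round trips.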
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210017883.1A CN114372027B (en) | 2022-01-07 | 2022-01-07 | Data processing method, device, processor and electronic device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210017883.1A CN114372027B (en) | 2022-01-07 | 2022-01-07 | Data processing method, device, processor and electronic device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114372027A (en) | 2022-04-19 |
| CN114372027B (en) | 2025-03-25 |
Family
ID=81143254
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210017883.1A Active CN114372027B (en) | 2022-01-07 | 2022-01-07 | Data processing method, device, processor and electronic device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114372027B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115481180A (en) * | 2022-09-14 | 2022-12-16 | 上海浦东发展银行股份有限公司 | Data loading method and device and computer equipment |
| CN116775613A (en) * | 2023-06-28 | 2023-09-19 | 中国建设银行股份有限公司 | Data migration method, device, electronic equipment and computer readable medium |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108470040A (en) * | 2018-02-11 | 2018-08-31 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
| CN111339041A (en) * | 2020-03-10 | 2020-06-26 | 中国建设银行股份有限公司 | File parsing and warehousing and file generating method and device |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109947759A (en) * | 2017-07-17 | 2019-06-28 | 中国移动通信集团吉林有限公司 | A data index establishment method, index retrieval method and device |
| CN113254262B (en) * | 2020-02-13 | 2023-09-05 | 中国移动通信集团广东有限公司 | Database disaster recovery method and device and electronic equipment |
| CN111352838B (en) * | 2020-02-28 | 2025-08-19 | 中国平安人寿保险股份有限公司 | Method and device for generating package file and electronic equipment |
2022-01-07: CN application CN202210017883.1A, patent CN114372027B (en), status Active
Also Published As
| Publication number | Publication date |
|---|---|
| CN114372027A (en) | 2022-04-19 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US9535754B1 (en) | Dynamic provisioning of computing resources | |
| CN110019298B (en) | Data processing method and device | |
| CN110515795B (en) | Big data component monitoring method and device and electronic equipment | |
| CN114372027B (en) | Data processing method, device, processor and electronic device | |
| CN106294113B (en) | creation method and device based on programmable test service | |
| CN106411970B (en) | A fault handling method, device and system based on service invocation | |
| CN112559444B (en) | SQL file migration method, device, storage medium and equipment | |
| CN112527792A (en) | Data storage method, device, equipment and storage medium | |
| CN111240987B (en) | Method and device for detecting migration program, electronic equipment and computer readable storage medium | |
| CN115525545A (en) | Docker-based automatic testing method, system, equipment and medium | |
| CN110019497B (en) | Data reading method and device | |
| CN117076473B (en) | Metadata operation method, system, equipment and medium for SaaS multi-tenant | |
| US10372513B2 (en) | Classification of application events using call stacks | |
| CN111435327B (en) | Log record processing method, device and system | |
| CN114138646B (en) | CAD resource verification method and device, storage medium and processor | |
| CN116467010A (en) | Method and device for calling host resource by plug-in, processor and electronic equipment | |
| CN116841653A (en) | Execution method and device of operation and maintenance job, processor and electronic equipment | |
| CN116594734A (en) | Container migration method and device, storage medium and electronic equipment | |
| CN115309477A (en) | Plug-in data processing method, system and readable storage medium for Apache Spark component | |
| CN111400245B (en) | Art resource migration method and device | |
| CN116048615B (en) | Distributed program slicing method, device and equipment based on natural language processing | |
| CN114629788B (en) | Configuration information updating method, system, storage medium and electronic device | |
| CN112579189A (en) | Configuration file updating method and device | |
| CN113835709B (en) | Information analysis method, device, storage medium and processor | |
| CN117033674A (en) | Picture storage method and device, storage medium and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |