CN117112184B

CN117112184B - Task scheduling service method and system based on container technology

Info

Publication number: CN117112184B
Application number: CN202311369652.8A
Authority: CN
Inventors: 许靖; 柴磊; 韦晓
Original assignee: Shenzhen Magic Digital Intelligent Artificial Intelligence Co ltd
Current assignee: Shenzhen Magic Digital Intelligent Artificial Intelligence Co ltd
Priority date: 2023-10-23
Filing date: 2023-10-23
Publication date: 2024-02-02
Anticipated expiration: 2043-10-23
Also published as: CN117112184A

Abstract

The invention provides a task scheduling service method and system based on container technology. The method includes: obtaining data source connection information, communicating by using the connection information, and invoking the corresponding code and monitoring its status; configuring the data source to obtain a managed template of the code; creating a new container from a base image and configuring a Python package index source; creating and managing dedicated virtual environments; and, after a task is invoked, pushing the execution request to the task scheduling service, calculating the required system resources and file resources from the parameter information of the pushed task, adding the task to a task queue, obtaining the virtual environment corresponding to the task, and saving the scheduling parameters. The system includes an information processing module, an environment creation module and a task processing module. The containerized deployment of the invention supports rapid deployment and distributed deployment; when task execution services are deployed as a cluster, execution efficiency is greatly improved.

Description

Task scheduling service method and system based on container technology

Technical Field

The invention relates to the technical field of data processing, in particular to a task scheduling service method and system based on a container technology.

Background

As metric (index) systems are organized in different industries, the technologies they use also vary, chosen according to the requirements of the target database in which the data is stored, performance and other factors. When a metric system must support different computing engines and databases, a single system cannot provide that diversity or fully satisfy the resource requirements. The prior art has the following problems.

Data processing code stored by users cannot be published directly as a service for timed scheduled execution, and an external platform is needed. The code involved includes SQL, Python, Spark and other code, so the system must be compatible with the execution characteristics of each kind of code, support the scheduling logic of various code types, and support timed scheduling and monitoring.

Existing systems generally do not isolate the execution environment. Invoked code easily intrudes into the running environment of the system and becomes entangled with it. Non-standard code may interfere with the running of the original system during execution, pollute the host environment and cause crashes, so system security requirements generally cannot be met.

User-level environment configuration cannot be provided for the environments required by different code, which greatly affects the user experience.

When stored data processing code such as Python code is invoked, dependency package versions may be incompatible, or inconsistent package versions may produce different execution results. The execution environment is tightly coupled with the system environment, and it is very difficult to put code into production without modification. The usual industry practice is to integrate the code manually or to have operations staff rearrange and adapt the execution environment, which wastes time and labor and cannot keep up with a fast-moving market.

A task scheduling service that has not been separated out independently cannot be connected quickly to other platforms and is often restricted to a specific platform. Its fixed parameters and fixed calling modes often set the upper limit of the scheduling execution service. The less open it is to the outside, the harder componentization and microservices become, and the openness of the whole platform is affected.

Finally, there is a resource problem: the number of tasks the current platform can run depends on the resources of the local system, so when several people execute tasks at the same time the tasks easily block on resource requests, which affects the overall operation of the platform and hinders its normal use.

Prior art 1, application number CN201310342752.1, discloses a task scheduling service system and method, comprising: a task calling end module for initiating a task scheduling request; a service interface component module for creating tasks according to task scheduling requests and creating corresponding scheduling task records in the database; a task scanning component module for scanning the scheduling task records, calculating task priorities according to a task scheduling priority algorithm, and placing the tasks into the corresponding priority queues; a task queue component module for selecting the task to be executed from the priority queue according to a priority queue dequeuing algorithm; and a task execution module for executing the selected task. This system and method prioritize task scheduling, improve the execution efficiency of the task scheduling service, and give users a differentiated experience of task scheduling services with different priorities.

Prior art 2, application number CN201710287419.3, discloses a task scheduling server and task scheduling method, comprising a judging module, an acquisition module and a processing module. The judging module judges whether the current subtask is executed solely by its corresponding subsystem. The acquisition module acquires the task script corresponding to the subtask when the current subtask is executed solely by its corresponding subsystem. The processing module executes the task script, sends the execution request of the subtask to the corresponding subsystem, monitors the execution result of the subtask, and determines from the execution result whether the subtask has been executed. Calling the task script of a subtask makes it possible to determine whether the current subtask executed successfully, so that all subtasks can be executed in order.

Prior art 3, application number CN201710209051.9, discloses a Quartz-based timing task scheduling service framework and method, including: a configuration file containing configuration information; a task scheduler comprising a trigger and a job interface, which instantiates the trigger and the job interface from the configuration information to provide the corresponding task scheduling service; and a business task end configured with a business task program that inherits the job interface and that, after the task scheduling service starts, receives the trigger signal sent by the trigger so that the business task program completes the corresponding job. This approach meets timing task scheduling requirements simply and quickly and improves the development efficiency and quality of the business system.

None of prior arts 1, 2 and 3 allows code to be published directly as a service for timed scheduled execution or provides user-level environment configuration. In each of them the execution environment is tightly coupled with the system environment, the task scheduling service is not separated out independently, and tasks executed by several people at the same time are prone to blocking.

Disclosure of Invention

In order to solve the technical problems, the invention provides a task scheduling service method based on a container technology, which comprises the following steps:

acquiring data source connection information, communicating by using the connection information, and invoking the corresponding code and monitoring its state; configuring a data source to obtain a managed template of the code;

creating a new container from a base image, and configuring a Python package index source; creating and managing dedicated virtual environments;

after a task is invoked, pushing an execution request to a task scheduling service, calculating the required system resources and file resources from the parameter information of the pushed task, adding the task to a task queue, obtaining the virtual environment corresponding to the task, and saving the scheduling parameters.

Optionally, the process of obtaining the managed template of the code includes the steps of:

configuring the type and connection format of the data source and judging whether the data source is configured correctly; if so, initializing the data source, reading the data source connection information, importing the data source management interface and checking the connection permission; if not, notifying the operator that the data source configuration is incorrect;

after configuration is finished, establishing a data processing engine template and configuring the code, input parameters, output parameters and basic information of the data processing engine; writing the managed code online according to the code, obtaining coding features of the managed code, and selecting from a preset coding function library, based on the coding features, a coding style function corresponding to the target coding mode;

compiling the coding style function to obtain a managed coding function, and uniformly preprocessing the code to generate function call information corresponding to the managed coding function; performing managed compilation with the managed coding function based on the function call information to obtain a running-information compilation file of the data processing engine; and generating the application software of the data processing engine from the running-information compilation file.
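The first step above (checking the type and connection format of a data source before it is initialized) can be sketched as follows. This is only an illustration under stated assumptions: the names `DataSourceConfig`, `SUPPORTED_TYPES` and `validate_data_source` are hypothetical, the URL-style connection format is assumed rather than taken from the specification, and the host name is a placeholder.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical set of supported types; the text names spark, hive and mysql.
SUPPORTED_TYPES = {"spark", "hive", "mysql"}


@dataclass
class DataSourceConfig:
    source_type: str      # e.g. "mysql"
    connection_url: str   # assumed URL-style connection format


def validate_data_source(cfg: DataSourceConfig) -> list[str]:
    """Return configuration errors; an empty list means the config is usable."""
    errors = []
    if cfg.source_type not in SUPPORTED_TYPES:
        errors.append(f"unsupported type: {cfg.source_type}")
    parsed = urlparse(cfg.connection_url)
    if parsed.scheme != cfg.source_type:
        errors.append("connection URL scheme does not match the source type")
    if not parsed.hostname:
        errors.append("connection URL has no host")
    return errors


# A correct configuration yields no errors and could then be initialized;
# an incorrect one yields messages that can be reported to the operator.
good = DataSourceConfig("mysql", "mysql://user@db.example.internal:3306/metrics")
print(validate_data_source(good))
```

A real service would additionally open a connection and check permissions, which is omitted here because it depends on the driver of each data source.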

Optionally, the process of creating a new container by the base image comprises the steps of:

acquiring environment information of the base image, and creating and starting a new container from the base image, wherein the environment information comprises a basic environment, a language, computing packages and connection packages;

building the base image with an image and container management tool, storing the base image in an image repository, and creating a container group to obtain the new container; scheduling to a specific container management service according to the created container group, the container management service then calling a server;

the container management service configures a Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image and container management tool, the personal virtual environment being created in real time in the new container as a separate Python environment containing the Python interpreter and the packages required by the project.
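The last step above, creating a separate Python environment and pointing it at a package index source, can be approximated with the standard-library `venv` module. This is a minimal sketch, not the patented implementation: the function name `create_user_environment` and the index URL are hypothetical, and the container in which the environment would live is not modeled.

```python
import sys
import venv
from pathlib import Path


def create_user_environment(env_dir: Path, index_url: str) -> Path:
    """Create an isolated Python environment and point pip at a package index."""
    # with_pip=False keeps the sketch fast and offline; a real service would
    # probably enable pip so that packages can be installed afterwards.
    venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)

    # pip reads a per-environment config file from the environment root:
    # pip.conf on Unix-like systems, pip.ini on Windows.
    conf_name = "pip.ini" if sys.platform == "win32" else "pip.conf"
    conf_path = env_dir / conf_name
    conf_path.write_text(f"[global]\nindex-url = {index_url}\n", encoding="utf-8")
    return conf_path
```

Calling `create_user_environment(Path("envs/alice"), "https://pypi.example.internal/simple")` would leave an interpreter and a `pyvenv.cfg` file under `envs/alice`, with later `pip install` commands inside that environment resolving packages against the configured index.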

Optionally, the creating process of the virtual environment includes the following steps:

receiving a request for creating a virtual environment, selecting a base image according to the environment information of the virtual environment, and creating a new container from the base image;

parsing the request to obtain the number of virtual environments to be created, dividing the new container into a first base container and a second base container, and building the basic environment with the second base container to obtain a Python interpreter and dependency packages;

and building a virtual environment and a basic environment with the first base container, configuring the Python interpreter in the virtual environment, and creating the virtual environment according to the configured environment information and the dependency packages.

Optionally, the process of adding the task to a task queue, obtaining the virtual environment corresponding to the task and saving the scheduling parameters includes the following steps:

receiving a task invocation request, and pushing the task information into the message queue RabbitMQ;

obtaining configuration instructions for the message queue RabbitMQ, building a RabbitMQ management center according to the configuration instructions, obtaining the path configuration information to which the queue's task information is pushed, and listening to the queue according to the path configuration information, the task scheduling service obtaining the listening result; after obtaining the task information, calculating from its parameters the required system resources and file resources, including the number of CPU cores, memory consumption and hard disk resources, and adding the task to the task queue;

the task queue follows the first-in, first-out principle, and tasks wait to be executed in order; when a task reaches the front of the queue, the service obtains the task type, the scheduling type, the required virtual environment and the saved scheduling parameters from the task's parameters, and calls different execution classes according to these parameters to execute the task.
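The queueing behaviour described above can be illustrated with a small in-process sketch. It deliberately uses the standard-library `queue.Queue` as a stand-in for RabbitMQ, so no broker API is shown; the names `Task`, `estimate_resources` and `TaskScheduler`, the parameter keys and the default values are all hypothetical.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    task_type: str   # e.g. "python", "sql", "spark"
    env_name: str    # virtual environment the task should run in
    params: dict = field(default_factory=dict)


def estimate_resources(task: Task) -> dict:
    """Hypothetical rule: read requested resources from the parameters, with defaults."""
    return {
        "cpu_cores": int(task.params.get("cpu_cores", 1)),
        "memory_mb": int(task.params.get("memory_mb", 512)),
        "disk_mb": int(task.params.get("disk_mb", 1024)),
    }


class TaskScheduler:
    def __init__(self):
        # queue.Queue is first-in, first-out.
        self.pending = queue.Queue()

    def submit(self, task: Task) -> None:
        """Compute the resource estimate when the task arrives, then enqueue it."""
        self.pending.put((task, estimate_resources(task)))

    def next_task(self):
        """Return the oldest (task, resources) pair, or None if the queue is empty."""
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return None
```

In a deployment along the lines of the text, `submit` would be driven by a consumer listening on the RabbitMQ queue, and the task type and `env_name` of the dequeued task would select the execution class and the virtual environment.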

Optionally, the process of calling different execution classes to execute tasks includes the following steps:

activating the virtual environment, wherein the virtual environment is created remotely through a virtual environment management interface, is preconfigured, and is used for executing specific tasks or programs; dynamically loading the execution code, wherein the execution code is read according to a dynamic loading instruction, loaded or updated dynamically through a configuration port, programmable resources in the execution code are reconfigured, and the required execution code is instantiated as an executable class;

calling a specific method of the executable class with the preconfigured parameters to execute the task;

after the return result of the task execution is obtained, processing it as required: either the result is returned to the caller over HTTP, or the data is saved to a temporary table through the built-in table-persistence function and the caller is notified of the storage information over HTTP.
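The idea of loading execution code dynamically, instantiating it as an executable class and calling one of its methods with preconfigured parameters can be sketched in plain Python. The helper names `load_executable_class` and `run_task`, the convention that the class exposes a `run` method, and the shape of the returned dictionary are assumptions made for illustration.

```python
import types


def load_executable_class(source: str, class_name: str) -> type:
    """Compile source text into a throwaway module and return one class from it."""
    module = types.ModuleType("task_code")  # the module name is arbitrary
    exec(compile(source, "<task_code>", "exec"), module.__dict__)
    cls = getattr(module, class_name, None)
    if not isinstance(cls, type):
        raise ValueError(f"{class_name!r} is not a class in the submitted code")
    return cls


def run_task(source: str, class_name: str, params: dict) -> dict:
    """Instantiate the class, call its run() method, and wrap the outcome."""
    try:
        result = load_executable_class(source, class_name)().run(**params)
        return {"status": "success", "result": result}
    except Exception as exc:  # report failures instead of crashing the service
        return {"status": "failed", "error": str(exc)}


code = "class Adder:\n    def run(self, a, b):\n        return a + b\n"
print(run_task(code, "Adder", {"a": 2, "b": 3}))
```

Note that `exec` inside the service process provides no isolation by itself; in the scheme described here the isolation comes from running the code inside the task's own container and virtual environment. The returned dictionary is the kind of payload that could be sent back to the caller over HTTP.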

The invention also provides a task scheduling service system based on container technology, comprising:

the information processing module, which is responsible for acquiring data source connection information, communicating by using the connection information, and invoking the corresponding code and monitoring its state; and for configuring a data source to obtain a managed template of the code;

the environment creation module, which is responsible for creating a new container from the base image and configuring a Python package index source; and for creating and managing dedicated virtual environments;

and the task processing module, which is responsible for pushing an execution request to the task scheduling service after a task is invoked, calculating the required system resources and file resources from the parameter information of the pushed task, adding the task to a task queue, obtaining the virtual environment corresponding to the task, and saving the scheduling parameters.

Optionally, the environment creation module includes:

the container creation sub-module, which is responsible for acquiring environment information of the base image and for creating and starting a new container from the base image, wherein the environment information comprises a basic environment, a language, computing packages and connection packages;

the container scheduling sub-module, which is responsible for building the base image with an image and container management tool, storing the base image in an image repository, and creating a container group to obtain the new container; and for scheduling to a specific container management service according to the created container group, the container management service then calling a server;

the service configuration sub-module, which is responsible for having the container management service configure a Python package index source that supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image and container management tool, the personal virtual environment being created in real time in the new container as a separate Python environment containing the Python interpreter and the packages required by the project.

Optionally, the task processing module includes:

the information pushing sub-module, which is responsible for receiving a task invocation request and pushing the task information into the message queue RabbitMQ;

the instruction processing sub-module, which is responsible for obtaining configuration instructions for the message queue RabbitMQ, building a RabbitMQ management center according to the configuration instructions, obtaining the path configuration information to which the queue's task information is pushed, and listening to the queue according to the path configuration information, the task scheduling service obtaining the listening result; and, after the task information is obtained, for calculating from its parameters the required system resources and file resources, including the number of CPU cores, memory consumption and hard disk resources, and adding the task to the task queue;

the task execution sub-module, which is responsible for executing queued tasks in order, the task queue following the first-in, first-out principle; when a task reaches the front of the queue, the service obtains the task type, the scheduling type, the required virtual environment and the saved scheduling parameters from the task's parameters, and calls different execution classes according to these parameters to execute the task.

In the invention, data source connection information is first acquired, the connection information is used to communicate, and the corresponding code is invoked and its state monitored; the data source is configured to obtain a managed template of the code. Next, a new container is created from the base image and a Python package index source is configured; dedicated virtual environments are created and managed. Finally, after a task is invoked, an execution request is pushed to the task scheduling service, the required system resources, file resources and so on are calculated from the parameter information of the pushed task, the task is added to a task queue, the virtual environment corresponding to the task is obtained, and the scheduling parameters are saved.

The task scheduling execution service applies the convenience of container technology to the field of data processing, integrating multiple virtual environments and providing higher execution efficiency.

Multi-environment extension gives different users of the same platform more choices: each user can define a virtual environment independently to execute different code, so that SQL, Python and Spark code can be integrated quickly in the same system.

Functional diversity greatly widens the range of supported code. Further execution code types can also be supported quickly later, since installing different packages to create an independent virtual environment is enough for the code to be invoked as a task.

Environment isolation is effective: because the code execution layer is completely separated out and handed to the task scheduling service center for execution, it is decoupled from the original system, which greatly improves the stability and security of the system. System problems caused while executing task code, such as os operations in Python code, and crashes caused by incorrect operations are avoided. Compared with current systems whose execution component is not offered as a service, the invention is superior in execution efficiency, security and diversity.

The containerized deployment supports rapid deployment and distributed deployment; when task execution services are deployed as a cluster, execution efficiency is greatly improved. Compared with the prior art, in which new environments are manually reconfigured and redeployed according to the environment each user requires, the system is simple and fast. Even code developers without a container background can quickly deploy the environment they need, which minimizes the learning cost for developers and the time spent on deployment.

The system has multiple data source adaptation components, supports Spark, Hive, MySQL and others, and can add, delete and modify data sources without depending on other components. In prior art solutions, by contrast, the database is coupled into the system and depends on a particular environment, so a user's existing database cannot be used directly as a data source. The invention integrates data source adaptation and code scheduling adaptation, so that metric-producing code written by earlier developers can be invoked and scheduled without modification, saving modification cost and time.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.

The technical scheme of the invention is further described in detail through the drawings and the embodiments.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:

FIG. 1 is a flow chart of a task scheduling service method based on container technology in embodiment 1 of the present invention;

FIG. 2 is a process diagram of obtaining the managed template of the code in embodiment 2 of the present invention;

FIG. 3 is a diagram of a process for creating a new container by a base image in embodiment 3 of the present invention;

FIG. 4 is a diagram illustrating a process of creating a virtual environment according to embodiment 4 of the present invention;

FIG. 5 is a schematic diagram of a virtual environment creation process in embodiment 4 of the present invention;

FIG. 6 is a process diagram of adding a task queue to obtain a virtual environment corresponding to a task and storing scheduled parameters in embodiment 5 of the present invention;

FIG. 7 is a schematic diagram of a process of adding a task queue to obtain a virtual environment corresponding to a task and storing scheduled parameters in embodiment 5 of the present invention;

FIG. 8 is a process diagram of invoking different execution classes for task execution in embodiment 6 of the present invention;

FIG. 9 is a block diagram of a task scheduling service system based on container technology in embodiment 7 of the present invention;

FIG. 10 is a block diagram of an information processing module in embodiment 8 of the present invention;

FIG. 11 is a block diagram of an environment creation module in embodiment 9 of the present invention;

FIG. 12 is a block diagram of a task processing module in embodiment 10 of the present invention.

Detailed Description

The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.

The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the application. As used in the examples and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.

When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims. The specific meaning of the terms in this application will be understood by those of ordinary skill in the art as the case may be.

Example 1: as shown in FIG. 1, an embodiment of the present invention provides a task scheduling service method based on container technology, which includes the following steps:

S100: acquiring data source connection information, communicating by using the connection information, and invoking the corresponding code and monitoring its state; configuring a data source to obtain a managed template of the code;

S200: creating a new container from the base image, and configuring a Python package index source; creating and managing dedicated virtual environments;

S300: after a task is invoked, pushing an execution request to the task scheduling service, calculating the required system resources, file resources and so on from the parameter information of the pushed task, adding the task to a task queue, obtaining the virtual environment corresponding to the task, and saving the scheduling parameters.

The working principle and beneficial effects of the technical scheme are as follows. First, data source connection information is acquired, the connection information is used to communicate, and the corresponding code is invoked and its state monitored; the data source is configured to obtain a managed template of the code. Next, a new container is created from the base image and a Python package index source is configured; dedicated virtual environments are created and managed. Finally, after a task is invoked, an execution request is pushed to the task scheduling service, the required system resources, file resources and so on are calculated from the parameter information of the pushed task, the task is added to a task queue, the virtual environment corresponding to the task is obtained, and the scheduling parameters are saved.

The scheme adopts a task scheduling execution service and applies the convenience of container technology to the field of data processing, integrating multiple virtual environments and providing higher execution efficiency.

Multi-environment extension gives different users of the same platform more choices: each user can define a virtual environment independently to execute different code, so that SQL, Python and Spark code can be integrated quickly in the same system.

Functional diversity greatly widens the range of supported code. Further execution code types can also be supported quickly later, since installing different packages to create an independent virtual environment is enough for the code to be invoked as a task.

Environment isolation is effective: because the code execution layer is completely separated out and handed to the task scheduling service center for execution, it is decoupled from the original system, which greatly improves the stability and security of the system. System problems caused while executing task code, such as os operations in Python code, and crashes caused by incorrect operations are avoided. Compared with current systems whose execution component is not offered as a service, this embodiment is superior in execution efficiency, security and diversity.

The containerized deployment supports rapid deployment and distributed deployment; when task execution services are deployed as a cluster, execution efficiency is greatly improved. Compared with the prior art, in which new environments are manually reconfigured and redeployed according to the environment each user requires, the system is simple and fast. Even code developers without a container background can quickly deploy the environment they need, which minimizes the learning cost for developers and the time spent on deployment.

The system has multiple data source adaptation components, supports Spark, Hive, MySQL and others, and can add, delete and modify data sources without depending on other components. In prior art solutions, by contrast, the database is coupled into the system and depends on a particular environment, so a user's existing database cannot be used directly as a data source. The data source adaptation and code scheduling adaptation integrated in this embodiment allow metric-producing code written by earlier developers to be invoked and scheduled without modification, saving modification cost and time.

Example 2: as shown in fig. 2, on the basis of embodiment 1, the process of obtaining the managed template of the code provided by the embodiment of the invention includes the following steps:

s104: the data source is configured in type and connection format, whether the data source is configured correctly is judged, if the data source is configured correctly, the data source is initialized, the data source connection information is read, the data source management interface is imported and the connection authority is checked, and if the data source is configured incorrectly, operators are informed that the data source is configured incorrectly;

s105: after configuration is finished, a data processing engine template is established, codes, input parameters, output parameters and basic information of the data processing engine are configured, on-line coding of managed codes is carried out according to the codes, coding characteristics of the managed codes are obtained, and a coding style function corresponding to a target coding mode is selected from a preset coding function library based on the coding characteristics;

S106: the coding style function is compiled to obtain a managed coding function, and the code is uniformly preprocessed to generate function call information corresponding to the managed coding function; managed compilation is performed with the managed coding function based on the function call information to obtain a running information compilation file of the data processing engine; application software of the data processing engine is generated from the running information compilation file;

The working principle and beneficial effects of the technical scheme are as follows: first, the type and connection format of the data source are configured and whether the configuration is correct is judged; if it is correct, the data source is initialized, the data source connection information is read, the information is imported through the data source management interface and the connection permission is verified; if it is incorrect, the operator is notified that the data source is configured incorrectly; after configuration is finished, a data processing engine template is established, and the code, input parameters, output parameters and basic information of the data processing engine are configured; the managed code is coded online according to the code, the coding characteristics of the managed code are obtained, and a coding style function corresponding to the target coding mode is selected from a preset coding function library based on the coding characteristics; finally, the coding style function is compiled to obtain a managed coding function, and the code is uniformly preprocessed to generate function call information corresponding to the managed coding function; managed compilation is performed with the managed coding function based on the function call information to obtain a running information compilation file of the data processing engine; application software of the data processing engine is generated from the running information compilation file. This scheme realizes configuration and connection management of data sources as well as template configuration and code generation for the data processing engine; configuring the data source ensures its correctness, and permission verification ensures the reliability and safety of the data; meanwhile, a data processing engine template is established, online coding is carried out, and a corresponding coding style function is generated as required, so that flexible data processing is realized; finally, the generated data processing engine application software improves the efficiency and accuracy of data processing and provides better data analysis and decision support. This embodiment provides a complete data processing flow and toolset that helps users better manage and utilize data resources.

In this embodiment, the spark connection information is imported through the data source management page and the connection permission is verified; after the data source is configured, a new spark template is created from the platform menu, and the spark code, input parameters, output content and other basic information of the template are configured, which completes the hosting of the code on the platform; once the template exists, it is imported through the expert model on the page, tasks are arranged by dragging and dropping, and task scheduling is carried out in the subsequent process.
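The configuration check in S104 can be illustrated with a minimal Python sketch. It is an assumption-laden illustration rather than the patented implementation: the `DataSourceConfig` fields, the `validate_config` and `configure_data_source` helpers, and the specific validity rules are hypothetical; only the supported source types (spark, hive, mysql) and the connection fields (host name, port number, user name, password) are taken from the text.

```python
from dataclasses import dataclass

# Source types named in the description; the validation rules below are assumptions.
SUPPORTED_TYPES = {"spark", "hive", "mysql"}


@dataclass
class DataSourceConfig:
    source_type: str
    host: str
    port: int
    username: str
    password: str


def validate_config(cfg):
    """Return a list of problems; an empty list means the configuration is correct."""
    problems = []
    if cfg.source_type.lower() not in SUPPORTED_TYPES:
        problems.append(f"unsupported type: {cfg.source_type}")
    if not cfg.host:
        problems.append("host is empty")
    if not 1 <= cfg.port <= 65535:
        problems.append(f"port out of range: {cfg.port}")
    if not cfg.username:
        problems.append("username is empty")
    return problems


def configure_data_source(cfg, notify=print):
    """Initialize the data source if correctly configured, otherwise notify the operator."""
    problems = validate_config(cfg)
    if problems:
        notify("data source configured incorrectly: " + "; ".join(problems))
        return None
    # Hypothetical "initialized" record holding the connection information.
    return {
        "type": cfg.source_type.lower(),
        "url": f"{cfg.host}:{cfg.port}",
        "user": cfg.username,
    }
```

Permission verification against a live spark, hive or mysql endpoint is deliberately left out, since it depends on the driver in use.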

Example 3: as shown in fig. 3, on the basis of embodiment 1, the process of creating a new container by the base mirror image provided in the embodiment of the present invention includes the following steps:

S201: acquiring environment information of the base image, and creating and starting a new container from the base image, wherein the environment information comprises the base environment, language, computing packages, connection packages and the like;

S202: building the base image with an image container management tool, storing the base image in an image repository, and creating a container group to obtain the new container; the created container group is dispatched to a specific container management service, and the container management service calls the server;

S203: the container management service configures a Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool, wherein the personal virtual environment is generated in the new container in real time and is an independent Python environment containing a Python interpreter of a specific version and the packages required by the project;

The working principle and beneficial effects of the technical scheme are as follows: first, environment information of the base image is acquired, and a new container is created and started from the base image, the environment information comprising the base environment, language, computing packages, connection packages and the like; second, the base image is built with an image container management tool and stored in an image repository, and a container group is created to obtain the new container; the created container group is dispatched to a specific container management service, and the container management service calls the server; finally, the container management service configures a Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool, the personal virtual environment being generated in the new container in real time as an independent Python environment containing a Python interpreter of a specific version and the packages required by the project. By creating the personal virtual environment, this scheme isolates the user's development environment from the system environment, so that user operations do not affect the system environment and changes to the system environment do not interfere with the user's development work; the personal virtual environment allows the user to install a specific version of the Python interpreter and the packages required by a project, which makes project dependencies easy to manage, avoids dependency conflicts between projects, and ensures that each project runs normally; with personal virtual environments, a user can run several projects on the same machine, each with its own environment, and can quickly switch between the development environments of different projects, which improves development efficiency and flexibility; the personal virtual environment can be packaged into an image and stored in an image repository, so that at deployment time the environment required by the project is built quickly simply by deploying the image to the target environment, which simplifies deployment. This embodiment creates a personal virtual environment that provides an independent, isolated development environment, makes dependencies easy for users to manage, improves flexibility and simplifies project deployment, which is of considerable benefit to developers.

In this embodiment, for image container management, a Docker base image needs to be prepared in advance before the service is deployed; the base image contains a python base environment with python3 built in (the service is implemented with the python Flask framework) and includes common computing and connection packages such as pandas, numpy and pyspark; at first deployment, a new container is started from the base image and all environment information is acquired; through the image container management function on the front-end page of the system, an internal pip source is configured, which supports direct downloading and installation of new packages; a dedicated personal virtual environment is created by adding it on the page; this environment is completely independent of the system environment, belongs only to the user, and is generated in the container in real time.
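As a rough illustration of creating a per-user virtual environment that points at an internal pip source, the following sketch uses Python's standard `venv` module and a per-environment pip configuration file. The function name `create_personal_env`, the directory layout and the example index URL are assumptions made for illustration; the patent does not specify how the environment is created inside the container.

```python
import os
import venv
from pathlib import Path


def create_personal_env(root, user, index_url, with_pip=False):
    """Create an isolated environment for one user and point pip at an internal index.

    with_pip=False keeps the sketch fast and offline; a real deployment would
    presumably set it to True so that packages can be installed afterwards.
    """
    env_dir = Path(root) / user
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(env_dir)
    # pip reads a per-environment config file from the environment root:
    # pip.conf on Unix-like systems, pip.ini on Windows.
    conf_name = "pip.ini" if os.name == "nt" else "pip.conf"
    (env_dir / conf_name).write_text(
        f"[global]\nindex-url = {index_url}\n", encoding="utf-8"
    )
    return env_dir
```

Each call produces a separate directory with its own interpreter configuration, which is what keeps one user's packages from affecting another's or the system environment.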

Example 4: as shown in fig. 4, on the basis of embodiment 3, the creation process of the virtual environment provided by the embodiment of the present invention includes the following steps:

S2031: receiving a request for creating a virtual environment, selecting a base image according to the environment information of the virtual environment, and creating a new container from the base image;

S2032: parsing the request to obtain the number of virtual environments to be created, dividing the new container into a first base container and a second base container, and building the base environment with the second base container to obtain a Python interpreter and dependency packages;

S2033: building the virtual environment and the base environment with the first base container, configuring the Python interpreter in the virtual environment, and creating the virtual environment according to the configured environment information and the dependency packages;

The working principle and beneficial effects of the technical scheme are as follows: first, a request for creating a virtual environment is received, a base image is selected according to the environment information of the virtual environment, and a new container is created from the base image; second, the request is parsed to obtain the number of virtual environments to be created, the new container is divided into a first base container and a second base container, and the base environment is built with the second base container to obtain a Python interpreter and dependency packages; finally, the virtual environment and the base environment are built with the first base container, the Python interpreter is configured in the virtual environment, and the virtual environment is created according to the configured environment information and the dependency packages (the principle is shown in fig. 5). The scheme has the following advantages. Isolation: creating independent virtual environments avoids dependency conflicts between projects; each virtual environment has its own Python interpreter and dependency packages, and the environments do not affect one another. Independence: each virtual environment can have a different version of the Python interpreter and dependency packages to meet the requirements of different projects, the required interpreter and packages being obtained by dividing the base container and building the base environment. Flexibility: several virtual environments are created according to the number specified in the request, each environment is managed and maintained independently, and the specific requirements of a project are met by configuring the environment information and dependency packages of its virtual environment. Simplified management: automating the creation of virtual environments reduces the workload of manual operation and configuration, and sharing the base container and base environment makes the versions and updates of dependency packages easier to manage and maintain. This embodiment provides an isolated, independent and flexible development environment, so that developers can better manage and maintain project dependencies, improving development efficiency and project maintainability.

Example 5: as shown in fig. 6, on the basis of embodiment 1, the process of adding a task queue to obtain a virtual environment corresponding to a task and storing a scheduled parameter according to the embodiment of the present invention includes the following steps:

S301: receiving a task invocation request, and pushing the task information into the RabbitMQ message queue;

S302: obtaining a configuration instruction of the RabbitMQ message queue, constructing a RabbitMQ management center according to the configuration instruction, obtaining the path configuration information to which the task information of the queue is pushed, and monitoring the queue according to the path configuration information, the monitoring result being obtained by the task scheduling service; after the task information is acquired, the required system resources, file resources and the like, including the number of CPU cores, memory consumption and hard disk resources, are calculated according to the parameters, and the task is added to the task queue;

S303: the task queue follows the first-in first-out principle, and tasks wait to be executed in order; when a task reaches the head of the queue, the service obtains the task type, the scheduling type, the required virtual environment and the saved scheduling parameters according to the parameters of the task; according to these parameters, different execution classes are called to execute the task;

The working principle and beneficial effects of the technical scheme are as follows: first, a task invocation request is received, and the task information is pushed into the RabbitMQ message queue; second, a configuration instruction of the RabbitMQ message queue is obtained, a RabbitMQ management center is constructed according to the configuration instruction, the path configuration information to which the task information of the queue is pushed is obtained, the queue is monitored according to the path configuration information, and the monitoring result is obtained by the task scheduling service; after the task information is acquired, the required system resources, file resources and the like, including the number of CPU cores, memory consumption and hard disk resources, are calculated according to the parameters, and the task is added to the task queue; finally, the task queue follows the first-in first-out principle, and tasks wait to be executed in order; when a task reaches the head of the queue, the service obtains the task type, the scheduling type, the required virtual environment and the saved scheduling parameters according to the parameters of the task; according to these parameters, different execution classes are called to execute the task (the principle is shown in fig. 7). The scheme has the following advantages. Improved task scheduling efficiency: task information is pushed to the message queue and acquired by monitoring the queue, so that tasks are processed asynchronously and executed concurrently. Flexible task scheduling configuration: the execution mode and path of a task are configured according to specific requirements through the configuration instruction and path configuration information, making task scheduling more flexible and controllable. Resource management and scheduling: system resources are managed and scheduled by calculating the required system resources, file resources and the like, including the number of CPU cores, memory consumption and hard disk resources, which avoids resource overload or waste and improves the overall performance of the system. Task queue management: the first-in first-out principle of the task queue ensures that tasks are executed in order, avoiding lost or out-of-order tasks and ensuring the accuracy and integrity of task execution. Support for multiple task types and scheduling types: different execution classes are called according to the parameters and scheduling requirements of the task, supporting various task types and scheduling types. This embodiment provides an efficient, flexible and controllable task scheduling service that manages and schedules tasks effectively and improves the overall performance of the system and the accuracy of task execution.
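The queueing and dispatch logic of S302 and S303 can be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions: the standard-library `queue.Queue` stands in for the task queue that the service fills from RabbitMQ, and the `Task` fields, the `estimate_resources` helper, the default resource values and the two executor classes are all hypothetical names introduced here.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    task_type: str       # e.g. "sql" or "python"
    schedule_type: str   # hypothetical values such as "once" or "timed"
    venv_name: str       # virtual environment the task must run in
    params: dict = field(default_factory=dict)


def estimate_resources(task):
    """Hypothetical estimate derived from the pushed task parameters."""
    return {
        "cpu_cores": task.params.get("cpu_cores", 1),
        "memory_mb": task.params.get("memory_mb", 256),
        "disk_mb": task.params.get("disk_mb", 100),
    }


# One execution class per task type; each stand-in only reports what it would run.
class SqlExecutor:
    def run(self, task):
        return f"sql:{task.task_id}@{task.venv_name}"


class PythonExecutor:
    def run(self, task):
        return f"python:{task.task_id}@{task.venv_name}"


EXECUTORS = {"sql": SqlExecutor, "python": PythonExecutor}


def enqueue(task_queue, task):
    """Compute the resource estimate and append the task to the FIFO queue."""
    task_queue.put((task, estimate_resources(task)))


def run_all(task_queue):
    """Execute tasks in arrival order, choosing the execution class by task type."""
    results = []
    while not task_queue.empty():
        task, resources = task_queue.get()
        executor = EXECUTORS[task.task_type]()
        results.append((executor.run(task), resources["cpu_cores"]))
    return results
```

In a deployment along the lines described, a RabbitMQ consumer would call `enqueue` for each message it receives, and the executors would activate the named virtual environment before running the task.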

Example 6: as shown in fig. 8, on the basis of embodiment 5, the process of calling different execution classes to execute tasks provided in the embodiment of the present invention includes the following steps:

S3031: activating the virtual environment, wherein the virtual environment is created remotely through a virtual environment management interface and is preconfigured for executing specific tasks or programs; dynamically loading the execution code, wherein the execution code is read according to a dynamic loading instruction, dynamically loaded or updated through a configuration port, the programmable resources in the execution code are reconfigured, and the required execution code is instantiated as an executable class;

S3032: calling a specific method of the executable class with the preconfigured parameters to execute the task;

S3033: after the return result of the task execution is obtained, processing it as required; the result is returned to the caller over the Hypertext Transfer Protocol (HTTP); if the data volume is large, the data is saved to a temporary table through the built-in function for writing data to a table, and the caller is notified of the storage information over HTTP;

The working principle and beneficial effects of the technical scheme are as follows: first, this embodiment activates the virtual environment, which is created remotely through a virtual environment management interface and is preconfigured for executing specific tasks or programs; the execution code is dynamically loaded: it is read according to a dynamic loading instruction, dynamically loaded or updated through a configuration port, the programmable resources in the execution code are reconfigured, and the required execution code is instantiated as an executable class; second, a specific method of the executable class is called with the preconfigured parameters to execute the task; finally, after the return result of the task execution is obtained, it is processed as required; the result is returned to the caller over the Hypertext Transfer Protocol (HTTP); if the data volume is large, the data is saved to a temporary table through the built-in function for writing data to a table, and the caller is notified of the storage information over HTTP. The scheme has the following advantages. Flexibility and reusability: the execution code is packaged into executable classes, which makes the code easy to organize and reuse and improves its flexibility and maintainability. Dynamic loading and configuration: the execution code is dynamically loaded or updated through the configuration port and can be configured and reconfigured as needed to suit different task requirements. Extensibility and customizability: a specific method of the executable class is called with the preconfigured parameters to execute various specific tasks, and the returned result is processed as required; meanwhile, large volumes of data are saved to a temporary table and the caller is notified of the storage information over HTTP, which improves the extensibility and customizability of the system. This embodiment processes and returns results as required, meeting the needs of different application scenarios.
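A minimal Python sketch of S3031 to S3033 follows, assuming the hosted code is stored as a Python source file. The helper names (`load_executable_class`, `run_task`), the `max_rows` threshold, the JSON response shape and the in-memory `store` dictionary that stands in for the temporary table are all assumptions for illustration; the dynamic loading itself uses the standard `importlib.util` API.

```python
import importlib.util
import json


def load_executable_class(path, class_name):
    """Dynamically load a source file and return the named class from it."""
    spec = importlib.util.spec_from_file_location("hosted_task", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, class_name)


def run_task(path, class_name, method, params, store, max_rows=1000):
    """Instantiate the class, call the method, and build the response body.

    `store` is a dict standing in for the temporary table; a real service
    would write to a database and send the returned JSON over HTTP.
    """
    cls = load_executable_class(path, class_name)
    result = getattr(cls(), method)(**params)
    if isinstance(result, list) and len(result) > max_rows:
        table = f"tmp_{class_name.lower()}"
        store[table] = result
        return json.dumps({"status": "stored", "table": table, "rows": len(result)})
    return json.dumps({"status": "ok", "result": result})
```

Small results are returned inline, while results above the threshold are stored and only their location is reported to the caller, mirroring the two return paths described in S3033.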

Example 7: as shown in fig. 9, on the basis of embodiment 1 to embodiment 6, the task scheduling service system based on container technology provided in the embodiment of the present invention includes:

the task processing module is responsible for pushing an execution request to the task scheduling service after a task is invoked, calculating the required system resources, file resources and the like according to the parameter information of the pushed task, adding the task to a task queue, obtaining the virtual environment corresponding to the task, and saving the scheduling parameters;

The working principle and beneficial effects of the technical scheme are as follows: the information processing module of this embodiment acquires the connection information of the data source, communicates using the connection information, and invokes the corresponding code and monitors its status; it configures the data source to obtain a managed template of the code; the environment creation module creates a new container from the base image, configures a Python package index source, and creates and manages dedicated virtual environments; after a task is invoked, the task processing module pushes an execution request to the task scheduling service, calculates the required system resources, file resources and the like according to the parameter information of the pushed task, adds the task to a task queue, obtains the virtual environment corresponding to the task, and saves the scheduling parameters. The scheme adopts a task scheduling execution service and applies the convenience of container technology to the technical field of data processing, so that multiple virtual environments are integrated and higher execution efficiency is provided; multiple environments can be added, so that different users of the same platform have more choices and can each define their own virtual environments to execute different code, which allows sql, python, spark and other code to be integrated rapidly in the same system; this functional diversity greatly broadens the range of supported code, and other kinds of execution code can be supported quickly in later extensions, since they can be invoked as tasks simply by installing different packages to create independent virtual environments; environment isolation is effective: because the code execution layer is completely stripped out and handed over to the task scheduling service center for execution, the coupling with the original system is removed, the stability and the safety of the system are greatly improved, system problems caused while executing task code, such as os operations in python code, are avoided, and system crashes caused by incorrect operations are avoided; compared with current systems whose execution component is not provided as a service, this embodiment is superior in execution efficiency, safety, diversity and the like; the containerized deployment mode supports rapid deployment and distributed deployment, and when the task execution services are deployed as a cluster, the execution efficiency is greatly improved; compared with the prior art, in which a new environment is manually reconfigured and redeployed for each environment a user requires, the system is simple and fast; even code developers without a container background can rapidly deploy the required environment, which greatly reduces the developers' learning cost and the time spent on deployment; the system is provided with various data source adaptation components, supports spark, hive, mysql and the like, and can add, delete and modify various data sources without depending on other components; in prior art solutions, by contrast, the database is coupled into the system and, depending on the particular circumstances, the user's existing database cannot be used directly as a data source. The data source adaptation and code scheduling adaptation integrated in this embodiment allow the existing index output code of earlier developers to be invoked and scheduled without modification, saving modification cost and time.

Example 8: as shown in fig. 10, on the basis of embodiment 7, an information processing module provided in an embodiment of the present invention includes:

the configuration processing sub-module is responsible for configuring the type and connection format of the data source and judging whether the data source is configured correctly; if it is configured correctly, the data source is initialized, the data source connection information is read, the information is imported through the data source management interface and the connection permission is verified; if it is configured incorrectly, the operator is notified that the data source is configured incorrectly;

the code hosting sub-module is responsible for establishing a data processing engine template after configuration is completed, configuring the code, input parameters, output parameters and basic information of the data processing engine, coding the managed code online according to the code, obtaining the coding characteristics of the managed code, and selecting a coding style function corresponding to the target coding mode from a preset coding function library based on the coding characteristics;

the information compiling sub-module is responsible for compiling the coding style function to obtain a managed coding function, and uniformly preprocessing the code to generate function call information corresponding to the managed coding function; performing managed compilation with the managed coding function based on the function call information to obtain a running information compilation file of the data processing engine; and generating application software of the data processing engine from the running information compilation file;

The working principle and beneficial effects of the technical scheme are as follows: the configuration processing sub-module of this embodiment configures the type and connection format of the data source and judges whether the data source is configured correctly; if it is configured correctly, the data source is initialized, the data source connection information is read, the information is imported through the data source management interface and the connection permission is verified; if it is configured incorrectly, the operator is notified that the data source is configured incorrectly; after configuration is completed, the code hosting sub-module establishes a data processing engine template, configures the code, input parameters, output parameters and basic information of the data processing engine, codes the managed code online according to the code, obtains the coding characteristics of the managed code, and selects a coding style function corresponding to the target coding mode from a preset coding function library based on the coding characteristics; the information compiling sub-module compiles the coding style function to obtain a managed coding function, and uniformly preprocesses the code to generate function call information corresponding to the managed coding function; managed compilation is performed with the managed coding function based on the function call information to obtain a running information compilation file of the data processing engine; application software of the data processing engine is generated from the running information compilation file. The scheme has the following advantages. Acquisition of the data source connection information: the information required to connect to an external data source, such as the host name, port number, user name and password, is obtained by parsing the configuration information, and this information is the key to connecting to the external data 
source, so that a data processing engine can accurately acquire index data; establishing a communication connection: through the data source connection information, each node in the data processing engine cluster can establish communication connection, realize the coordination of data transmission and tasks, transmit index data from the data source to each node, and perform distributed data processing and analysis in the cluster; the data processing efficiency is improved: by establishing communication connection, parallel processing of data is realized in the data processing engine cluster, and each node can process different data fragments simultaneously, so that the efficiency and speed of data processing are greatly improved, and the method is particularly important for large-scale data sets and complex data processing tasks; task coordination and allocation are realized: after communication connection is established, each node in the data processing engine cluster coordinates and distributes tasks, data and computing resources can be shared among the nodes, and the tasks are reasonably distributed according to the characteristics of the tasks and the load conditions of the nodes, so that more efficient data processing is realized. 
The embodiment realizes the coordination of data transmission and tasks in the data processing engine cluster, improves the efficiency and performance of data processing, and can better process and analyze big data; configuration and connection management of the data sources are realized, and template configuration and code generation of a data processing engine are realized; the correctness of the data source is ensured by configuring the data source, and the authority verification is carried out to ensure the reliability and the safety of the data; meanwhile, a data processing engine template is established and online coding is carried out, and a corresponding coding style function is generated according to requirements, so that flexible data processing operation is realized; and finally, the generated data processing engine application software improves the efficiency and accuracy of data processing and provides better data analysis and decision support. The embodiment provides a complete set of data processing flow and tool, which helps users to better manage and utilize data resources.

Example 9: as shown in fig. 11, on the basis of embodiment 7, an environment creation module provided in an embodiment of the present invention includes:

the container creation sub-module is responsible for acquiring environment information of the base image, and creating and starting a new container from the base image, wherein the environment information comprises the base environment, language, computing packages, connection packages and the like;

the service configuration sub-module is responsible for having the container management service configure a Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool, wherein the personal virtual environment is generated in the new container in real time and is an independent Python environment containing a Python interpreter of a specific version and the packages required by the project;

The working principle and beneficial effects of this technical solution are as follows: the container creation sub-module of this embodiment obtains the environment information of the basic image and creates and starts a new container from the basic image, the environment information including the basic environment, language, computing packages, connection packages and the like; the container scheduling sub-module builds the basic image through the image container management tool, stores it in the image repository, and creates a container group to obtain the new container; the created container group is scheduled to a specific container management service, and the container management service calls the server; the service configuration sub-module has the container management service configure a Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool, wherein the personal virtual environment is generated in the new container in real time and provides an independent Python environment containing a Python interpreter of a specific version and the packages required by the project. By creating the personal virtual environment, the solution isolates the user's development environment from the system environment, so that the user's operations do not affect the system environment and changes to the system environment do not interfere with the user's development work. The personal virtual environment allows the user to install a specific version of the Python interpreter and the packages required by a project, which makes project dependencies easier to manage, avoids dependency conflicts between projects, and ensures that each project runs normally. With personal virtual environments, a user can run several projects on the same machine, each with its own independent environment, and can switch quickly between the development environments of different projects, which improves development efficiency and flexibility. The personal virtual environment can also be packaged into an image and stored in the image repository; at deployment time, the environment required by the project can be built quickly simply by deploying the image to the target environment, which simplifies the deployment process. This embodiment thus creates a personal virtual environment that provides an independent and isolated development environment, makes dependency management convenient for users, improves flexibility and simplifies project deployment, which is of significant value to developers.
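As an illustration only, the creation of a personal virtual environment with its own package index source might be sketched as follows using the Python standard library. The function names `create_personal_env` and `interpreter_path`, the directory layout and the index URL are assumptions made for this sketch and are not taken from the patent; the container management service and the image container management tool are not modelled.

```python
import configparser
import sys
import tempfile
import venv
from pathlib import Path


def create_personal_env(env_dir: Path, index_url: str) -> Path:
    """Create an isolated Python environment and point pip at an index source."""
    # with_pip=False keeps this sketch fast and offline; a real service would
    # likely use with_pip=True so that packages can be installed afterwards.
    venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)

    # pip reads a per-environment configuration file from the environment
    # root: pip.ini on Windows, pip.conf elsewhere.
    conf_name = "pip.ini" if sys.platform == "win32" else "pip.conf"
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    with open(env_dir / conf_name, "w", encoding="utf-8") as fh:
        config.write(fh)
    return env_dir


def interpreter_path(env_dir: Path) -> Path:
    """Return the interpreter that belongs to the environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical environment name and index URL.
        env = create_personal_env(Path(tmp) / "user_env", "https://pypi.org/simple")
        print(interpreter_path(env))
```

Because each environment carries its own interpreter link and its own pip configuration, packages installed for one project do not affect the host interpreter or other environments, which is the isolation property described above.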

Example 10: as shown in fig. 12, on the basis of embodiment 7, a task processing module provided in an embodiment of the present invention includes:

the instruction processing sub-module is responsible for obtaining the configuration instructions of the message queue RabbitMQ, building a RabbitMQ management center according to the configuration instructions, obtaining the path configuration information for pushing task information to the queue, and monitoring the queue according to the path configuration information, whereby the task scheduling service obtains the monitoring result; after the task information is obtained, the required system resources and file resources, including the number of CPU cores, memory usage, hard disk resources and the like, are calculated from the parameters, and the task is added to the task queue;

the task execution sub-module is responsible for queuing tasks so that they wait in order to be executed, wherein the task queue follows the first-in-first-out principle; when a task reaches its turn for execution, the service obtains, from the parameters of the task, the task type and scheduling type as well as the required virtual environment and the saved scheduling parameters; according to these parameters, different execution classes are called to execute the task;

The working principle and beneficial effects of this technical solution are as follows: the information push sub-module of this embodiment receives a request to call up a task and pushes the task information to the message queue RabbitMQ; the instruction processing sub-module obtains the configuration instructions of the message queue RabbitMQ, builds a RabbitMQ management center according to the configuration instructions, obtains the path configuration information for pushing task information to the queue, and monitors the queue according to the path configuration information, whereby the task scheduling service obtains the monitoring result; after the task information is obtained, the required system resources and file resources, including the number of CPU cores, memory usage, hard disk resources and the like, are calculated from the parameters, and the task is added to the task queue; the task queue of the task execution sub-module follows the first-in-first-out principle, and tasks wait in order to be executed; when a task reaches its turn for execution, the service obtains, from the parameters of the task, the task type and scheduling type as well as the required virtual environment and the saved scheduling parameters, and calls different execution classes according to these parameters to execute the task. The solution improves task scheduling efficiency: pushing task information to the message queue and obtaining it by monitoring the queue enables asynchronous processing and concurrent execution of tasks. It allows flexible configuration of task scheduling: the execution mode and path of a task can be configured to specific requirements through the configuration instructions and the path configuration information, which makes task scheduling more flexible and controllable. It supports resource management and scheduling: calculating the required system resources and file resources, including the number of CPU cores, memory usage and hard disk resources, allows system resources to be managed and scheduled reasonably, which avoids resource overload or waste and improves the overall performance of the system. It provides task queue management: the first-in-first-out principle of the task queue ensures that tasks are executed in order, which avoids lost or out-of-order tasks and ensures the accuracy and integrity of task execution. It supports multiple task types and scheduling types: different execution classes are called according to the parameters of the task and the scheduling requirements, so that various task types and scheduling types can be applied flexibly. This embodiment thus provides an efficient, flexible and controllable task scheduling service that can effectively manage and schedule tasks and improve both the overall performance of the system and the accuracy of task execution.
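The flow described above (estimate resources from the task parameters, enqueue the task, dequeue in first-in-first-out order, then select an execution class by task type) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: an in-process `queue.Queue` stands in for the RabbitMQ-fed task queue, and the names `Task`, `estimate_resources`, `PythonExecutor`, `SqlExecutor`, `Scheduler`, the parameter keys and the default values are all hypothetical.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    task_type: str       # hypothetical labels, e.g. "python" or "sql"
    schedule_type: str   # hypothetical labels, e.g. "once" or "cron"
    venv_name: str       # virtual environment the task should run in
    params: dict = field(default_factory=dict)


def estimate_resources(task: Task) -> dict:
    """Derive CPU, memory and disk needs from task parameters (made-up defaults)."""
    return {
        "cpu_cores": int(task.params.get("cpu_cores", 1)),
        "memory_mb": int(task.params.get("memory_mb", 512)),
        "disk_mb": int(task.params.get("disk_mb", 1024)),
    }


class PythonExecutor:
    def run(self, task: Task) -> str:
        # A real executor would activate task.venv_name and run the code.
        return f"python:{task.task_id}@{task.venv_name}"


class SqlExecutor:
    def run(self, task: Task) -> str:
        return f"sql:{task.task_id}"


# Maps a task type to the execution class that handles it.
EXECUTORS = {"python": PythonExecutor, "sql": SqlExecutor}


class Scheduler:
    def __init__(self) -> None:
        # queue.Queue is FIFO; it stands in for the broker-fed task queue.
        self._queue = queue.Queue()

    def submit(self, task: Task) -> dict:
        """Estimate resources, then append the task to the queue."""
        resources = estimate_resources(task)
        self._queue.put((task, resources))
        return resources

    def run_all(self) -> list:
        """Drain the queue in arrival order, dispatching by task type."""
        results = []
        while not self._queue.empty():
            task, _resources = self._queue.get()
            executor_cls = EXECUTORS[task.task_type]
            results.append(executor_cls().run(task))
        return results


if __name__ == "__main__":
    scheduler = Scheduler()
    scheduler.submit(Task("t1", "python", "once", "user_env", {"cpu_cores": 2}))
    scheduler.submit(Task("t2", "sql", "cron", "user_env"))
    print(scheduler.run_all())  # ['python:t1@user_env', 'sql:t2']
```

Keeping the executor lookup in a dictionary means a new task type only requires registering another execution class, which mirrors the "different execution classes are called according to the parameters" behaviour without changing the queueing logic.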

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A task scheduling service method based on container technology, which is characterized by including the following steps:

Obtain the data source connection information, use the connection information to communicate, and call up the corresponding code and monitoring status; configure the data source and obtain the managed template of the code;

Create a new container through the basic image and configure the Python package index source; at the same time, create and manage a dedicated virtual environment;

When a task is called up, push the execution request to the task scheduling service, calculate the corresponding system resources and file resources based on the parameter information of the pushed task, add the task queue to obtain the virtual environment corresponding to the task, and save the scheduling parameters;

The process of obtaining the managed template of the code includes the following steps:

Configure the data source type and connection format, and determine whether the data source configuration is correct; if the data source configuration is correct, initialize the data source, read the data source connection information, import the data source management interface, and perform connection permission verification; if the data source configuration is incorrect, notify the operator that the data source configuration is incorrect;

After the configuration is completed, establish a data processing engine template and configure the code, input parameters, output parameters and basic information of the data processing engine; perform online coding of the managed code according to the code, obtain the coding characteristics of the managed code, and, based on the coding characteristics, select from a preset coding function library the coding style function corresponding to the target coding method;

Compile the coding style function to obtain the managed coding function, and uniformly preprocess the code to generate the function call information corresponding to the managed coding function; use the managed coding function to perform managed compilation based on the function call information, and obtain the running-information compiled file of the data processing engine; generate the application software of the data processing engine based on the running-information compiled file.

2. The task scheduling service method based on container technology as claimed in claim 1, characterized in that the process of creating a new container through a basic image includes the following steps:

Obtain the environment information of the basic image, create and start a new container through the basic image, the environment information includes the basic environment, language, computing package and connection package;

Build a basic image through the image container management tool, store it in the image repository, create a container group, and obtain a new container; schedule the created container group to a specific container management service, and the container management service calls the server;

The container management service configures the Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool; the personal virtual environment is generated in the new container in real time and provides an independent Python environment that contains the Python interpreter and the packages required for the project.

3. The task scheduling service method based on container technology as claimed in claim 1, characterized in that the creation process of the virtual environment includes the following steps:

Receive a request to create a virtual environment, select a base image based on the environment information of the virtual environment, and create a new container based on the base image;

Parse the request to get the number of virtual environments created, divide the new container into a first basic container and a second basic container, use the second basic container to build a basic environment, and get the Python interpreter and dependency packages;

Use the first basic container to build a virtual environment and a basic environment, configure the Python interpreter in the virtual environment, and create a virtual environment based on the configured environment information and dependency packages.

4. The task scheduling service method based on container technology as claimed in claim 1, characterized in that the process of adding a task queue to obtain the virtual environment corresponding to the task and saving the scheduling parameters includes the following steps:

Receive the request to activate the task and push the task information to the message queue RabbitMQ;

Obtain the configuration instructions of the message queue RabbitMQ, build the RabbitMQ management center according to the configuration instructions, obtain the path configuration information for pushing task information to the queue, and monitor the queue according to the path configuration information, the task scheduling service obtaining the monitoring results; after obtaining the task information, calculate the required system resources and file resources according to the parameters, including the number of CPU cores, memory usage and hard disk resources, and add the task to the task queue;

The task queue follows the first-in-first-out principle, and tasks queue up in order to wait for execution; when a task is queued for execution, the service obtains the task type and scheduling type according to the parameters of the task, as well as the required virtual environment and saved scheduling parameters; according to the parameters, different execution classes are called to execute the task.

5. The task scheduling service method based on container technology as claimed in claim 1, characterized in that the process of calling different execution classes for task execution includes the following steps:

Activate the virtual environment, wherein the virtual environment is created remotely through the virtual environment management interface, is pre-configured, and is used to perform specific tasks or programs; dynamically load the execution code: read the execution code according to the dynamic loading instructions, dynamically load or update the execution code through the configuration port, reconfigure its internal programmable resources, and instantiate the required execution code into an executable class;

Use preconfigured parameters to call specific methods of the executable class to perform tasks;

After obtaining the return result of the task execution, process it as required: return the result to the caller via the Hypertext Transfer Protocol (HTTP), or save the data to a temporary table through the built-in data table function and notify the caller of the saved information via HTTP.

6. A task scheduling service system based on container technology, which is characterized by including:

The information processing module is responsible for obtaining the data source connection information, using the connection information to communicate, and calling up the corresponding code and monitoring status; configuring the data source and obtaining the hosting template of the code;

The environment creation module is responsible for creating new containers through basic images and configuring Python package index sources; at the same time, creating and managing exclusive virtual environments;

The task processing module is responsible for pushing execution requests to the task scheduling service when a task is called up, calculating the corresponding system resources and file resources based on the parameter information of the pushed task, adding a task queue to obtain the virtual environment corresponding to the task, and saving the scheduling parameters;

The environment creation module includes:

The container creation submodule is responsible for obtaining the environment information of the basic image, creating and opening a new container through the basic image. The environment information includes the basic environment, language, computing package and connection package;

The container scheduling sub-module is responsible for building a basic image through the image container management tool, storing it in the image repository, creating a container group, and obtaining a new container; the created container group is scheduled to a specific container management service, and the container management service calls the server;

The service configuration submodule is responsible for having the container management service configure the Python package index source, which supports downloading and installing new Python packages; a personal virtual environment is added and created through the page of the image container management tool; the personal virtual environment is generated in the new container in real time and provides an independent Python environment that contains the Python interpreter and the packages required for the project.

7. The task scheduling service system based on container technology as claimed in claim 6, characterized in that the task processing module includes:

The information push sub-module is responsible for receiving the task call request and pushing the task information to the message queue RabbitMQ;

The instruction processing sub-module is responsible for obtaining the configuration instructions of the message queue RabbitMQ, building the RabbitMQ management center according to the configuration instructions, obtaining the path configuration information for pushing task information to the queue, and monitoring the queue according to the path configuration information, the task scheduling service obtaining the monitoring results; after obtaining the task information, it calculates the required system resources and file resources according to the parameters, including the number of CPU cores, memory usage and hard disk resources, and adds the task to the task queue;

The task execution submodule is responsible for queuing tasks in order for execution, the task queue following the first-in-first-out principle; when a task is queued for execution, the service obtains the task type and scheduling type according to the parameters of the task, as well as the required virtual environment and saved scheduling parameters; according to the parameters, different execution classes are called for task execution.