CN112398668A

CN112398668A - IaaS cluster-based cloud platform and node switching method

Info

Publication number: CN112398668A
Application number: CN201910749798.2A
Authority: CN
Inventors: 刘志鹏
Original assignee: Kyland Technology Co Ltd
Current assignee: Kyland Technology Co Ltd
Priority date: 2019-08-14
Filing date: 2019-08-14
Publication date: 2021-02-23
Anticipated expiration: 2039-08-14
Also published as: CN112398668B

Abstract

Embodiments of the present invention disclose an IaaS cluster-based cloud platform and a node switching method. The IaaS cluster-based cloud platform includes an application high-availability layer and a data high-availability layer, and the application high-availability layer is connected to the data high-availability layer. Because the two layers are used together, an application running on a node that is detected as failed in the application high-availability layer can be moved to another node, so high availability of a program can be achieved by deploying it on a single node. This changes the existing deployment mode of cluster applications and improves their deployment efficiency.

Description

IaaS cluster-based cloud platform and node switching method

Technical Field

Embodiments of the invention relate to cloud computing technology, and in particular to an IaaS cluster-based cloud platform and a node switching method.

Background

In recent years, cloud computing technology has developed rapidly. It allows users to access a dynamically configurable pool of shared computing resources (including network devices, servers, storage, applications, and services) over a ubiquitous, convenient network, and enables these configurable computing resources to be provisioned rapidly with minimal administrative overhead or service provider interaction. Given these advantages, enterprises need to consider whether and how to migrate to the cloud.

In the prior art, a conventional cluster application is deployed as a program running on multiple nodes, and the back-end storage used by the program may itself be a cluster, so that a cluster-in-cluster mode is formed. This mode is robust, but because of its complex structure the maintenance cost is multiplied and the deployment process is very complicated. As the number of applications grows, the number of clusters grows with it, and these cluster applications have become difficult to maintain and difficult to migrate.

Disclosure of Invention

The embodiment of the invention provides a cloud platform and node switching method based on an IaaS cluster, which changes the existing cluster application deployment mode, improves the cluster application deployment efficiency and realizes high availability of application and storage.

In a first aspect, an embodiment of the present invention provides a cloud platform based on an IaaS cluster, where the cloud platform includes: an application program high-availability layer and a data high-availability layer, wherein the application program high-availability layer is connected with the data high-availability layer;

the application program high-availability layer runs on an infrastructure as a service (IaaS) cluster and comprises a plurality of nodes; a storage volume is arranged in the node, and the node is used for managing application programs running on the node through the storage volume;

the data high-availability layer is used for storing storage volumes of all nodes running the application programs in the application program high-availability layer;

the application program high-availability layer is used for selecting a new node from the application program high-availability layer when a fault node is detected; acquiring a storage volume corresponding to the failed node from the data high-availability layer, and deploying the storage volume in the new node; and the application program running on the failed node is redeployed on the new node, so that the application program running on the failed node is transferred to run on the new node.

In a second aspect, an embodiment of the present invention provides a method for switching a node, where the method includes:

when the application program high-availability layer detects a fault node, selecting a new node from the application program high-availability layer;

the application program high-availability layer acquires a storage volume corresponding to the failed node from the data high-availability layer and deploys the storage volume in the new node;

and the application program high-availability layer redeploys the application program running on the fault node on the new node so as to transfer the application program running on the fault node to the new node to run.

In the technical solution of the embodiment of the present invention, the IaaS cluster-based cloud platform includes an application program high-availability layer and a data high-availability layer that are used together. When a node in the application program high-availability layer is detected to have failed, a new node is selected from the application program high-availability layer, the storage volume corresponding to the failed node is obtained from the data high-availability layer and deployed in the new node, and finally the application program that was running on the failed node is redeployed on the new node, so that it is transferred to the new node to run.

Drawings

Fig. 1 is a schematic structural diagram of a cloud platform based on an IaaS cluster in the first embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a cloud platform based on an IaaS cluster in the second embodiment of the present invention;

Fig. 3 is a flowchart of a node switching method in the third embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

Example one

Fig. 1 is a schematic structural diagram of an IaaS cluster-based cloud platform in the first embodiment of the present invention. The technical solution of this embodiment is suitable for the case where a high-availability cluster application is created in an IaaS cluster. The IaaS cluster-based cloud platform includes an application high-availability layer 1 and a data high-availability layer 2, and the application high-availability layer 1 is connected to the data high-availability layer 2.

The application program high-availability layer 1 runs on an infrastructure as a service (IaaS) cluster and comprises a plurality of nodes; the node is internally provided with a storage volume and is used for managing the application programs running on the node through the storage volume.

Infrastructure as a Service (IaaS) is the bottom layer of cloud services. It mainly provides users with basic computing resources, such as processing, storage and networking, which can be used to deploy and run operating systems or applications.

In this embodiment, a virtual machine cluster for supporting a specific service is first created on an IaaS cluster, and then the application high-availability layer 1 and the data high-availability layer 2 are constructed on the created virtual machine cluster. The application high-availability layer 1 comprises a plurality of nodes. Each node runs the applications deployed on it, and each node has a storage volume mounted for storing data related to those applications. The storage volume exists in the corresponding node in a form independent of the application high-availability layer 1, and its life cycle is independent of the node on which it is mounted: after the current node goes down, the storage volume is not deleted, but can be mounted on another node for data storage.
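The independence of a storage volume's life cycle from its node can be illustrated with a minimal sketch. The names used here (`Volume`, `Node`, `mount`, `fail`) are hypothetical and are not taken from this document or from any library; the sketch only models the behaviour described above, in which a node failure detaches the volume without deleting it, so that the volume can be mounted on another node.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Volume:
    """A storage volume whose data outlives any node it is mounted on."""
    name: str
    data: Dict[str, str] = field(default_factory=dict)
    mounted_on: Optional[str] = None  # name of the node currently using it


@dataclass
class Node:
    name: str
    alive: bool = True
    volume: Optional[Volume] = None

    def mount(self, volume: Volume) -> None:
        if not self.alive:
            raise RuntimeError(f"node {self.name} is down")
        if volume.mounted_on is not None:
            raise RuntimeError(f"volume {volume.name} is mounted on {volume.mounted_on}")
        volume.mounted_on = self.name
        self.volume = volume

    def fail(self) -> None:
        """Simulate a crash: the volume is detached but its data is kept."""
        self.alive = False
        if self.volume is not None:
            self.volume.mounted_on = None
            self.volume = None


node_a, node_b = Node("node-a"), Node("node-b")
vol = Volume("app-data")
node_a.mount(vol)
vol.data["counter"] = "41"

node_a.fail()      # the node goes down; the volume is not deleted
node_b.mount(vol)  # the same volume is mounted on another node
print(node_b.volume.data)  # {'counter': '41'}
```

In this sketch the data written while the volume was on `node-a` is still readable after the volume has been remounted on `node-b`.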

Optionally, the application high availability layer is implemented by a container cluster, and the data high availability layer is implemented by a storage cluster.

Optionally, the container cluster uses Swarm or Kubernetes.

The two optional embodiments above provide specific ways of implementing the application high-availability layer and the data high-availability layer. High availability of the application can be achieved by using the container cluster's ability to monitor and manage multiple nodes, and high availability of the data can be achieved by using the storage cluster. In addition, to suit this IaaS cluster-based cloud platform, when a container cluster is used to achieve application high availability, its health check should be configured to be highly sensitive, that is, to run at a higher frequency than the original health check.
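The effect of a more frequent health check can be sketched as follows. This is an illustration only: the `HealthChecker` class, the rule of three missed checks, and the 10-second and 1-second intervals are assumptions made for the example and do not come from this document, Swarm, or Kubernetes.

```python
from typing import Dict, List


class HealthChecker:
    """Marks a node as failed when its last heartbeat is older than the timeout."""

    def __init__(self, interval: float, missed_allowed: int = 3) -> None:
        self.interval = interval                  # seconds between health checks
        self.timeout = interval * missed_allowed  # silence tolerated before failure
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)


default = HealthChecker(interval=10.0)   # assumed "original" check frequency
sensitive = HealthChecker(interval=1.0)  # assumed higher-frequency setting

for checker in (default, sensitive):
    checker.heartbeat("node-a", now=0.0)
    checker.heartbeat("node-b", now=0.0)
    checker.heartbeat("node-b", now=4.0)  # node-a stops reporting after t=0

# Five seconds in, only the sensitive checker has noticed the failure.
print(default.failed_nodes(now=5.0))    # []
print(sensitive.failed_nodes(now=5.0))  # ['node-a']
```

A shorter interval shortens the time during which a failed node goes unnoticed, at the cost of more check traffic.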

Both Swarm and Kubernetes are cluster management tools, and are used for abstracting a virtual machine cluster into a whole and uniformly managing containerization applications on a plurality of hosts in a cloud platform. For example, after the cluster management tool detects that a certain node in the container cluster is down, the application program running on the node may be redeployed to other nodes to continue running.

And the data high-availability layer 2 is used for storing the storage volume of each node running the application program in the application program high-availability layer 1.

The data high-availability layer 2 is also established in a virtual machine cluster supporting specific services in the IaaS cluster. It aggregates the storage space of a plurality of devices into a storage pool that provides a uniform access interface and management interface to the application servers. Data can be stored on and read from the plurality of storage devices according to certain rules, which makes full use of the performance of the storage devices and of the disk capacity. In this embodiment, the data high-availability layer 2 is used to store and read the storage volume mounted in each node of the application high-availability layer 1, and the reading and storing of these storage volumes is realized through the communication connection between the application high-availability layer 1 and the data high-availability layer 2.

It should be noted that, in this embodiment, the data high availability layer 2 is also deployed in a virtual machine cluster supporting a specific service in the IaaS cluster, and in some other cases, the data high availability layer 2 may also be deployed independently from the IaaS cluster.

The application program high-availability layer 1 is used for selecting a new node from the application program high-availability layer 1 when a fault node is detected; acquiring a storage volume corresponding to the failed node from the data high-availability layer 2, and deploying the storage volume in the new node; and the application program running on the failed node is redeployed on the new node, so that the application program running on the failed node is transferred to run on the new node.

In this embodiment, when the application high-availability layer 1 detects that a node running a certain application program has failed, it selects a new node to replace the failed node and continue running the application program. Specifically, after the new node is selected, the storage volume that was mounted on the failed node is read from the data high-availability layer 2 and mounted on the new node. The application program that was running on the failed node is then redeployed to the new node, where it is restarted from the data stored in the storage volume, thereby achieving high availability of the application.

In the technical solution of the embodiment of the present invention, the IaaS cluster-based cloud platform includes an application program high-availability layer and a data high-availability layer that are used together. When a node in the application program high-availability layer is detected to have failed, a new node is selected from the application program high-availability layer, the storage volume corresponding to the failed node is obtained from the data high-availability layer and deployed in the new node, and finally the application program that was running on the failed node is redeployed on the new node, so that it is transferred to the new node to run. In this way the application program high-availability layer and the data high-availability layer are structurally separated, and high availability of the application program is achieved.

Example two

Fig. 2 is a schematic structural diagram of an IaaS cluster-based cloud platform in the second embodiment of the present invention. This embodiment is applicable to the case where a high-availability cluster application is created in an IaaS cluster. It refines the above embodiment and explains the structure of the IaaS cluster-based cloud platform in detail.

Optionally, the plurality of nodes specifically include: at least one master node 11 and at least one compute node 12;

the master node 11 has a built-in agent program 111 and is configured to deploy, by running the agent program 111, at least one application program created on the cloud platform on the master node and/or the compute node.

In this alternative embodiment, the application high availability layer 1 includes a plurality of nodes, and the nodes are divided into two categories, namely, a master node 11 and a compute node 12, where the master node 11 refers to a node in which an agent 111 is embedded, and the agent 111 is used to deploy an application in each node. Illustratively, as shown in fig. 2, the node containing the agent in the application high availability layer 1 is a master node 11, and both the computing node B and the computing node C are computing nodes 12.

It is understood that the agent 111 is only involved while an application is being deployed. After each node starts to run its application, the agent 111 no longer plays a role, and the master node 11 then behaves like the other computing nodes 12, running the applications deployed on it.

Optionally, the cloud platform further includes: at least one client device 3, wherein the client device 3 is connected with the master node 11 in the application program high-availability layer 1;

the client device 3 has a built-in management program 31 and is configured to, by running the management program 31, fill the application template according to the received application parameters input by the user and provide the filled application template to the agent program 111, so as to publish at least one application program on the cloud platform;

the agent program 111 is specifically configured to parse the filled application template sent by the management program 31 to obtain an executable code; providing the executable code to the application high availability layer 1 such that the application high availability layer 1 creates the at least one application by executing the executable code.

In this optional embodiment, the cloud platform further includes at least one client device 3. A user can input the application parameters for deploying an application program through the client device 3, which sends them to the application high-availability layer 1 for processing, so that the application program is finally deployed. Specifically, the client device 3 has a built-in management program 31. The management program 31 completes a preset application template according to the application parameters input by the user, and sends the application template containing the application parameters to the agent 111 built into the master node. The application template is a configuration file that contains the basic parameters of the application program and can be parsed by a preset driver to obtain executable code. After receiving the application template containing the application parameters, the agent 111 uses the preset driver to turn the template into executable code for the application high-availability layer 1 and sends that code to the application high-availability layer 1, which creates at least one application program in each node by running the executable code.
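The fill-then-parse flow between the management program and the agent can be sketched as below. Everything in the sketch is hypothetical: the template fields (`name`, `image`, `replicas`, `volume`), the function names, and the image name are invented for illustration, and a list of deployment actions stands in for the "executable code" that the preset driver would produce.

```python
import json
from string import Template

# Hypothetical preset application template; the field names are illustrative.
APP_TEMPLATE = Template(
    '{"name": "$name", "image": "$image", "replicas": $replicas, "volume": "$volume"}'
)


def fill_template(params: dict) -> str:
    """Management program side: fill the template with the user's parameters."""
    # substitute() raises KeyError if a required parameter is missing.
    return APP_TEMPLATE.substitute(params)


def parse_template(filled: str) -> list:
    """Agent side: turn the filled template into a list of deployment actions."""
    spec = json.loads(filled)
    actions = [("create_volume", spec["volume"])]
    for i in range(spec["replicas"]):
        actions.append(("start_container", f'{spec["name"]}-{i}', spec["image"]))
    return actions


filled = fill_template(
    {"name": "web", "image": "nginx:1.25", "replicas": 1, "volume": "web-data"}
)
for action in parse_template(filled):
    print(action)
```

Keeping the template as plain data means the client side needs no knowledge of the container cluster; only the agent's driver has to know how to translate it.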

Optionally, the client device 3 further includes a Web program 32, configured to receive the application parameters input by a user and send them to the management program 31, so that the application template is filled according to the application parameters.

Optionally, the management program 31 and the agent program 111 are communicatively connected by a message queue.

Optionally, the message queue adopts a RabbitMQ system.

In both alternative embodiments, the management program 31 may send the template containing the application parameters to the agent 111 by means of a message queue, for example, using the RabbitMQ system.
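The decoupling that a message queue provides between the management program and the agent can be sketched with Python's standard `queue` module. This is a stand-in only: it runs in a single process and does not use RabbitMQ, and the function names `manager_publish` and `agent_consume` are hypothetical.

```python
import json
import queue

# In-process stand-in for a broker queue; a real deployment would use RabbitMQ.
template_queue = queue.Queue()


def manager_publish(filled_template: dict) -> None:
    """Management program: serialise the filled template and enqueue it."""
    template_queue.put(json.dumps(filled_template).encode("utf-8"))


def agent_consume() -> dict:
    """Agent program: take the next message off the queue and decode it."""
    body = template_queue.get(timeout=1)
    template_queue.task_done()
    return json.loads(body.decode("utf-8"))


manager_publish({"name": "web", "replicas": 1})
received = agent_consume()
print(received["name"])  # web
```

With a queue in between, the management program does not need the agent to be reachable at the moment a template is submitted.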

Optionally, the storage of the application high-availability layer 1 uses a Cinder volume, and the Cinder volume uses a Ceph system to realize multi-copy storage.

Optionally, the application high-availability layer 1 and the data high-availability layer 2 are in communication connection through a RexRay plug-in.

The connection between the storage volume mounted by each node and the Cinder volume of the data high-availability layer can be realized by using the RexRay plug-in.

Cinder is an abstraction layer of "logical storage volumes" introduced between the virtual machine and the specific storage device. It manages the corresponding back-end storage by calling the driver interfaces of the different storage back-end types, and provides users with a unified storage interface for volume-related operations. Ceph is a distributed file system that can realize multi-copy storage: the application on each node in the application high-availability layer 1 uses a single storage volume, but the actual storage back end of that volume is kept in multiple copies by the Ceph system. Illustratively, as shown in fig. 2, the data high-availability layer 2 includes three storage nodes A, B and C, that is, 3-copy storage at the back end is realized by the Ceph system.
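The idea of one logical volume backed by three copies can be sketched as follows. This is a deliberately simplified model, not a description of how Ceph or Cinder place or replicate data; the classes `StorageNode` and `ReplicatedVolume` are hypothetical.

```python
from typing import Dict, List, Optional


class StorageNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.blocks: Dict[str, bytes] = {}


class ReplicatedVolume:
    """One logical volume whose writes are copied to every live storage node."""

    def __init__(self, nodes: List[StorageNode]) -> None:
        self.nodes = nodes

    def write(self, key: str, value: bytes) -> int:
        written = 0
        for node in self.nodes:
            if node.alive:
                node.blocks[key] = value
                written += 1
        return written  # number of copies stored

    def read(self, key: str) -> Optional[bytes]:
        # Any surviving copy can serve the read.
        for node in self.nodes:
            if node.alive and key in node.blocks:
                return node.blocks[key]
        return None


nodes = [StorageNode(n) for n in ("A", "B", "C")]
volume = ReplicatedVolume(nodes)
copies = volume.write("state", b"order-42")
nodes[0].alive = False  # storage node A fails
print(copies, volume.read("state"))  # 3 b'order-42'
```

The application sees a single volume throughout; the loss of storage node A is absorbed by the copies on B and C.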

In the technical solution of this embodiment, the IaaS cluster-based cloud platform includes an application program high-availability layer and a data high-availability layer that are used together. When a node in the application program high-availability layer is detected to have failed, the application program running on that node can be redeployed to another node to continue running, so that high availability of the application is achieved.

Example three

Fig. 3 is a flowchart of a node switching method in the third embodiment of the present invention, where the technical solution in this embodiment is suitable for a case where a high-availability cluster application is created in an IaaS cluster, and the method may be applied to the cloud platform based on the IaaS cluster, and specifically includes the following steps:

Step 310: when the application program high-availability layer detects a failed node, it selects a new node in the application program high-availability layer.

In this embodiment, after detecting that a node running a certain program is down, the application program high-availability layer determines that the node is a failed node, and then selects a new node from the application program high-availability layer to replace it and continue executing the application program that was running on the failed node, thereby achieving high availability of the application program. For example, the node failure may be a power failure of the node or a system crash, which is not specifically limited in this embodiment.

Step 320: the application program high-availability layer acquires the storage volume corresponding to the failed node from the data high-availability layer and deploys the storage volume in the new node.

In this embodiment, after the application program high-availability layer has determined the new node that replaces the failed node, it can acquire the storage volume that was mounted by the failed node from the data high-availability layer through RexRay and mount that storage volume on the new node, so as to provide the new node with the data related to the application program that was run by the failed node.

Step 330: the application program high-availability layer redeploys the application program running on the failed node on the new node, so that the application program running on the failed node is transferred to run on the new node.

In this embodiment, after the storage volume corresponding to the failed node is mounted to the new node, the application program running on the failed node is deployed to the new node by the application program high availability layer, so that the application program can continue to run in the application program high availability layer.
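Steps 310 to 330 can be sketched as a single routine. The `Cluster` class below is hypothetical, and the selection rule in `select_new_node` (the healthy node currently running the fewest applications) is an assumption made for the example; this document does not specify how the new node is chosen.

```python
from typing import Dict, List


class Cluster:
    def __init__(self, nodes: List[str]) -> None:
        self.alive: Dict[str, bool] = {n: True for n in nodes}
        self.apps: Dict[str, str] = {}        # application -> node it runs on
        self.volumes: Dict[str, str] = {}     # volume -> node it is mounted on
        self.app_volume: Dict[str, str] = {}  # application -> its volume

    def deploy(self, app: str, volume: str, node: str) -> None:
        self.apps[app] = node
        self.volumes[volume] = node
        self.app_volume[app] = volume

    def select_new_node(self, failed: str) -> str:
        """Step 310: pick a healthy node (assumed rule: fewest applications)."""
        candidates = [n for n, ok in self.alive.items() if ok and n != failed]
        if not candidates:
            raise RuntimeError("no healthy node available")
        load = lambda n: sum(1 for host in self.apps.values() if host == n)
        return min(candidates, key=lambda n: (load(n), n))

    def switch_node(self, failed: str) -> str:
        self.alive[failed] = False
        new_node = self.select_new_node(failed)
        for app, node in list(self.apps.items()):
            if node != failed:
                continue
            # Step 320: mount the failed node's volume on the new node.
            self.volumes[self.app_volume[app]] = new_node
            # Step 330: redeploy the application on the new node.
            self.apps[app] = new_node
        return new_node


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.deploy("web", "web-data", "node-a")
cluster.deploy("db", "db-data", "node-b")

print(cluster.switch_node("node-a"))  # node-c
print(cluster.apps)                   # {'web': 'node-c', 'db': 'node-b'}
```

The volume is moved before the application is redeployed, matching the order of steps 320 and 330, so the application finds its data in place when it starts on the new node.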

Optionally, the step in which the application program high-availability layer redeploys the application program running on the failed node on the new node, so that the application program is transferred to the new node to run, further comprises:

the application program high-availability layer reassigns the virtual IP corresponding to the application program running on the failed node to the new node;

and the application program high-availability layer uses Ingress technology to forward the traffic corresponding to the failed node to the new node.

A virtual IP address is an IP address that is not bound to a particular computer or to a particular network interface card, so when the current node fails, another node can take it over. Each node has its own physical IP address, and a node may additionally hold a virtual IP address, which serves as its working IP address. For example, suppose the physical IP address of node 1 is 200.10.10.1, the physical IP address of node 2 is 200.10.10.2, and there is a virtual IP address 200.10.10.3. While an application program runs on node 1, node 1 uses the virtual IP address, that is, the working IP address of node 1 under normal conditions is 200.10.10.3. When node 1 fails, node 2 takes over the program that was running on node 1 and the virtual IP address drifts to node 2, so the working IP address of node 2 becomes 200.10.10.3.
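The address example above can be written out as a small sketch. The `VirtualIpTable` class and its methods are hypothetical; only the three addresses come from the text.

```python
from typing import Dict

PHYSICAL_IP = {"node1": "200.10.10.1", "node2": "200.10.10.2"}
VIRTUAL_IP = "200.10.10.3"


class VirtualIpTable:
    """Tracks which node currently holds each virtual IP address."""

    def __init__(self) -> None:
        self.owner: Dict[str, str] = {}

    def assign(self, vip: str, node: str) -> None:
        self.owner[vip] = node  # calling this again models the address drifting

    def resolve(self, vip: str) -> str:
        """Return the physical address of the node that currently holds the VIP."""
        return PHYSICAL_IP[self.owner[vip]]


table = VirtualIpTable()
table.assign(VIRTUAL_IP, "node1")
print(table.resolve(VIRTUAL_IP))  # 200.10.10.1

table.assign(VIRTUAL_IP, "node2")  # node1 fails; the virtual IP drifts to node2
print(table.resolve(VIRTUAL_IP))  # 200.10.10.2
```

Clients keep addressing 200.10.10.3 before and after the failure; only the node behind that address changes.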

In this optional embodiment, after a node running a certain application program fails, the application program high-availability layer deploys that application program to a new node to continue running. Accordingly, the virtual IP address corresponding to the failed node drifts to the new node, and the application program high-availability layer uses Ingress technology to forward the traffic corresponding to the failed node to the new node, so as to implement load balancing.

According to the technical solution of this embodiment, when the application program high-availability layer detects a failed node, it selects a new node from the application program high-availability layer, obtains the storage volume corresponding to the failed node from the data high-availability layer, deploys the storage volume in the new node, and finally redeploys the application program that was running on the failed node on the new node. The application program is thereby transferred to the new node to run, and high availability of the application program is achieved.

It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims

1. A cloud platform based on an IaaS cluster, comprising: the system comprises an application program high-availability layer and a data high-availability layer, wherein the application program high-availability layer is connected with the data high-availability layer;

2. The cloud platform of claim 1, wherein the application high availability layer is implemented by a container cluster and the data high availability layer is implemented by a storage cluster.

3. The cloud platform of claim 1, wherein the plurality of nodes specifically comprise: at least one master node and at least one compute node;

the master node has a built-in agent program and is configured to deploy, by running the agent program, at least one application program created on the cloud platform on the master node and/or the compute node.

4. The cloud platform of claim 3, wherein the cloud platform further comprises: at least one client device, wherein the client device is connected with the master node in the application program high-availability layer;

the client device is internally provided with a management program and used for filling the application template according to the received application parameters input by the user by operating the management program and providing the filled application template for the agent program so as to release at least one application program on the cloud platform;

the agent program is specifically configured to parse the filled application template sent by the management program to obtain an executable code; providing the executable code to the application high availability layer such that the application high availability layer creates the at least one application by executing the executable code.

5. The cloud platform of claim 1, wherein said application high availability layer and said data high availability layer are communicatively connected via a RexRay plug-in.

6. The cloud platform of claim 1, wherein the storage of the application high availability layer uses a Cinder volume, while the Cinder volume uses Ceph technology to implement multi-copy storage.

7. The cloud platform of claim 4, wherein the management program and the agent program are communicatively connected via a message queue.

8. The cloud platform of claim 2, wherein said container cluster employs Swarm or Kubernetes.

9. A node switching method applied to the IaaS cluster-based cloud platform according to any one of claims 1 to 8, comprising:

10. The method of claim 9, wherein the application high availability layer redeploys the application running on the failed node to the new node such that the application running on the failed node is transferred to run on the new node, further comprising:

the application program high-availability layer allocates the virtual IP corresponding to the application program running on the failed node to the new node;