CN119232657B - Server, communication control method, communication method, device, medium, and product - Google Patents
- Publication number
- CN119232657B (application CN202411735904.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- card
- routing configuration
- accelerator
- accelerator card
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/122—Avoiding congestion; Recovering from congestion by diverting traffic away from congested entities
Abstract
The invention relates to the technical field of servers and discloses a server, a communication control method, a communication method, a device, a medium, and a product. The server comprises a baseboard management controller and a plurality of accelerator cards connected to a network switch, and the method is applied to the baseboard management controller. The method comprises: receiving a first congestion detection result sent by the network switch; if it is determined from the first congestion detection result that a first accelerator card with data congestion exists in the server, searching for a second accelerator card without data congestion; and modifying the first data routing configuration in the first accelerator card, which points to the local card, into a second data routing configuration pointing to the second accelerator card, so that the first accelerator card sends data destined for the network switch to the second accelerator card based on the second data routing configuration, and the data is sent to the network switch through the second accelerator card. The real-time performance of network congestion processing can thereby be improved.
Description
Technical Field
The present disclosure relates to the technical field of servers, and in particular, to a server, a communication control method, a communication method, a device, a medium, and a product.
Background
With the development of technologies such as data analysis, high-performance computing, and artificial intelligence, data centers are growing in scale. As facilities that process large amounts of data, once a data center suffers network congestion, data packet loss or network failure follows, with serious consequences. Therefore, how to construct a high-performance data center GPU DIRECT RDMA network has become a core technology requiring attention.
Currently, some technologies use a control method based on the ECN (Explicit Congestion Notification) algorithm to alleviate data center network congestion. However, this method relies on congestion feedback from the data receiving end to the data transmitting end; once the receiving end's feedback is delayed, the transmitting end cannot respond in time, which degrades the real-time performance of network congestion processing.
Disclosure of Invention
In view of this, the present disclosure provides a server, a communication control method, a communication method, an electronic device, a computer-readable storage medium, and a computer program product, which can improve the real-time performance of network congestion processing.
In a first aspect, the present disclosure provides a server comprising:
A data transmission channel;
A plurality of accelerator cards, each comprising a data routing configuration, wherein, for any accelerator card: when the data routing configuration in the accelerator card is a first data routing configuration pointing to the local card, the accelerator card sends data to the network switch through its local network interface card; and when the data routing configuration in the accelerator card is a second data routing configuration pointing to another, target accelerator card, the accelerator card sends data to the target accelerator card through the data transmission channel, and the data is sent to the network switch through the target accelerator card;
and a baseboard management controller, connected to the network switch and to each accelerator card, and configured to modify the data routing configuration in an accelerator card when a specified condition is met, wherein the initial data routing configuration in each accelerator card is the first data routing configuration.
In a second aspect, the present disclosure provides a communication control method, where the method is applied to a baseboard management controller in the server, and the method includes:
Receiving a first congestion detection result sent by the network switch;
If it is determined from the first congestion detection result that a first accelerator card with data congestion exists in the server, searching for a second accelerator card without data congestion;
Modifying the first data routing configuration in the first accelerator card, which points to the local card, into a second data routing configuration pointing to the second accelerator card, so that the first accelerator card sends data destined for the network switch to the second accelerator card based on the second data routing configuration, and the data is sent to the network switch through the second accelerator card.
In a third aspect, the present disclosure provides a communication method, where the method is applied to a first accelerator card in the server, and the method includes:
When generating target data to be sent to a network switch, acquiring the local data routing configuration;
If the data routing configuration is a second data routing configuration pointing to a second accelerator card, determining that data congestion has occurred locally, and sending the target data to the second accelerator card so that the target data is sent to the network switch through the second accelerator card.
In a fourth aspect, the present disclosure provides a communication control apparatus applied to a baseboard management controller in the server, the apparatus including:
the detection result receiving module is used for receiving a first congestion detection result sent by the network switch;
The accelerator card state identification module is used for searching for a second accelerator card without data congestion if it is determined from the first congestion detection result that a first accelerator card with data congestion exists in the server;
And the route configuration modification module is used for modifying the first data routing configuration in the first accelerator card, which points to the local card, into a second data routing configuration pointing to the second accelerator card, so that the first accelerator card sends the data destined for the network switch to the second accelerator card based on the second data routing configuration, and the data is sent to the network switch through the second accelerator card.
In a fifth aspect, the present disclosure provides a communication apparatus applied to a first accelerator card in the above server, the apparatus comprising:
The route configuration acquisition module is used for acquiring local data route configuration when generating target data to be sent to the network switch;
And the data sending module is used for determining that data congestion has occurred locally if the data routing configuration is a second data routing configuration pointing to a second accelerator card, and for sending the target data to the second accelerator card so that the target data is sent to the network switch through the second accelerator card.
In a sixth aspect, the present disclosure provides an electronic device, including a memory and a processor, where the memory and the processor are communicatively connected to each other, and the memory stores computer instructions, and the processor executes the computer instructions, thereby performing the above communication control method, or performing the above communication method.
In a seventh aspect, the present disclosure provides a computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the above-described communication control method, or to execute the above-described communication method.
In an eighth aspect, the present disclosure provides a computer program product comprising computer instructions for causing a computer to perform the above-described communication control method, or to perform the above-described communication method.
In the technical solutions of some embodiments of the present disclosure, after a first accelerator card with data congestion in the server is determined based on a first congestion detection result sent by the network switch, the data to be sent to the network switch can be redirected, by modifying the data routing configuration in the first accelerator card, to a second accelerator card without data congestion, and sent to the network switch by the second accelerator card. On the one hand, transferring the data alleviates the congestion in the first accelerator card. On the other hand, based on the congestion detection result sent by the network switch, congestion relief can begin at the source of data transmission without waiting for feedback from the data receiver, which avoids the untimely congestion handling caused by receiver feedback delay; that is, the scheme of the present disclosure improves the real-time performance of network congestion processing.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the related art, the drawings that are required to be used in the description of the embodiments or the related art will be briefly described below, and it is apparent that the drawings in the following description are some embodiments of the present disclosure, and other drawings may be obtained according to the drawings without inventive effort to those of ordinary skill in the art.
FIG. 1 is a schematic diagram of a server under a data center GPU DIRECT RDMA network architecture provided by some techniques;
FIG. 2 is a schematic diagram of a server under a data center GPU DIRECT RDMA network architecture provided by one embodiment of the present disclosure;
FIG. 3 is a flow chart of a communication control method provided by one embodiment of the present disclosure;
FIG. 4 is a flow diagram of a communication method provided by one embodiment of the present disclosure;
FIG. 5 is a block diagram of a communication control device provided in one embodiment of the present disclosure;
FIG. 6 is a block diagram of a communication device provided by one embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of an electronic device provided by some embodiments of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. Based on the embodiments in this disclosure, all other embodiments that a person skilled in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure have been illustrated in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather, these embodiments are provided so that this disclosure will be more thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
In describing embodiments of the present disclosure, the term "comprising" and its like should be taken to be open-ended, i.e., including, but not limited to. The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". Other explicit and implicit definitions are also possible below.
In this context, unless explicitly stated otherwise, performing a step "in response to a" does not mean that the step is performed immediately after "a", but may include one or more intermediate steps.
It will be appreciated that the data (including but not limited to the data itself, the acquisition, use, storage or deletion of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
It will be appreciated that prior to using the technical solutions disclosed in the embodiments of the present disclosure, the relevant users, which may include any type of rights subjects, such as individuals, enterprises, groups, etc., should be informed and authorized by appropriate means of the types of information, usage ranges, usage scenarios, etc. involved in the present disclosure according to relevant legal regulations.
For example, in response to receiving an active request from a user, prompt information is sent to the relevant user to explicitly indicate that the requested operation will need to obtain and use the relevant user's information, so that the relevant user may autonomously choose, according to the prompt information, whether to provide information to the software or hardware (such as an electronic device, application program, server, or storage medium) that performs the operations of the technical solution of the present disclosure.
As an optional but non-limiting implementation, in response to receiving an active request from a relevant user, the prompt information may be sent to the relevant user, for example, in a pop-up window, where the prompt information may be presented as text. In addition, the pop-up window may carry a selection control allowing the user to choose whether to provide information to the electronic device by selecting "agree" or "disagree".
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
Before describing the technical scheme of the application, related concepts are described.
The data center can provide powerful computing and storage capacity through devices such as an integrated server, a storage device, a network switch, a router, a firewall and the like, and has wide application in the fields of big data analysis, high-performance computing, artificial intelligence and the like.
RDMA technology is a high performance network communication technology that allows one node in a network to directly access the memory of another node without intervention from the operating system. This technique can significantly reduce the delay of data transmission and the load of the CPU, thereby improving the efficiency of network communication.
RDMA network devices refer to network devices that support RDMA technology, which may typically include high-performance network interface cards (i.e., network cards), and which may have the capability to handle RDMA requests.
GPU DIRECT RDMA network architecture refers to a network architecture that combines GPU (Graphics Processing Unit) accelerated computing and RDMA (Remote Direct Memory Access) technologies. The network architecture can bypass the CPU (Central Processing Unit) and host memory to directly transfer data between the GPU and the RDMA network device, so that the delay of data transfer can be greatly reduced. After the GPU DIRECT RDMA network architecture is used in the data center, its low-latency characteristic can meet the data transmission speed and efficiency requirements in fields such as big data analysis, high-performance computing, and artificial intelligence, thereby promoting the development of these technologies.
Specifically, in some embodiments, the data center GPU DIRECT RDMA network architecture may include a plurality of servers, network interconnect devices (such as network switches and routers) for connecting the servers, storage devices for providing data storage functions for the servers, operating systems installed in the servers and the network interconnect devices, CPUs provided in the servers, GPUs, network interface cards, and the like. The network interface card in the server can have the capability of processing RDMA requests, so that the GPU in the server can bypass an operating system and a CPU, and directly access the memory of another server through the network interface card and the network interconnection equipment in the server, thereby greatly improving the data transmission speed and efficiency between the servers.
For ease of understanding, referring to fig. 1 in combination, a schematic diagram of a server under a data center GPU DIRECT RDMA network architecture provided by some technologies is shown. Fig. 1 illustratively includes two servers A and B. Either server may include a central processor, a baseboard management controller, a switch controller 0, and a plurality of accelerator cards. Any accelerator card n may include a switch controller n, a graphics processor n (i.e., a GPU), and a network interface card n (i.e., a network card). The central processor and the baseboard management controller may communicate to control the operation of the server, and may each interact with graphics processor n through switch controller 0 and the switch controller n in accelerator card n. Besides interacting with the central processor and the baseboard management controller, graphics processor n can connect to a network switch through network interface card n in accelerator card n and exchange data with the other server through the network switch.
Switch controller 0 may be understood as a data routing device located in the server, and switch controller n as a data routing device located in accelerator card n. Data routing configurations may be set in switch controller 0 and switch controller n to specify how data flows between devices in the server. For example, a data routing configuration in switch controller 1 may specify that data sent by graphics processor 1 is forwarded to network interface card 1, so that graphics processor 1 can exchange data directly with another server through network interface card 1 and the network switch.
In fig. 1, graphics processor n in accelerator card n bypasses the central processor and the operating system in the server and exchanges data directly with the other server, so accelerator card n can achieve higher data transmission speed and lower data transmission delay. For this reason, data center GPU DIRECT RDMA network architectures similar to the one shown in fig. 1 are widely applied in fields such as big data analysis, high-performance computing, and artificial intelligence. However, as a facility that processes large amounts of data, once the data center GPU DIRECT RDMA network architecture shown in fig. 1 suffers network congestion, the applications running in the data center are affected. Therefore, how to construct a high-performance data center GPU DIRECT RDMA network has become a core technology requiring attention.
In some techniques, based on the network architecture shown in fig. 1, a control method based on the ECN (Explicit Congestion Notification) algorithm may be employed to alleviate data center network congestion. Specifically, in these control methods, when a network interconnection device (such as a router) detects network congestion, it marks passing data packets, that is, it sets a corresponding mark in the ECN bits of the IP header or the TCP header. When these marked packets reach the data receiver, the receiver acknowledges the congestion indication, typically by sending the data sender an acknowledgement signal (ACK) carrying the same ECN mark. After receiving the ECN-marked acknowledgement, the data transmitting end actively reduces its data transmission rate to alleviate the congestion. Taking fig. 1 as an example, assume that after graphics processor 1 in accelerator card 1 of server A sends data packets to the network switch through network interface card 1, the network switch detects congestion; the network switch then marks the packets and sends the marked packets to server B. After receiving the marked packets, server B may send an ECN-marked acknowledgement signal to graphics processor 1, which, upon receiving it, may actively reduce its data transmission rate to alleviate the data congestion between network interface card 1 and the network switch.
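The ECN feedback loop described above can be sketched as follows. All names here (`Packet`, `Sender`, `ecn_marked`, and so on) and the rate-halving policy are illustrative assumptions for this sketch, not a real switch or NIC API.

```python
# Minimal sketch of the ECN feedback loop described above. All names and the
# rate-halving reaction are illustrative assumptions, not a real device API.
from dataclasses import dataclass

@dataclass
class Packet:
    payload: bytes
    ecn_marked: bool = False  # set by a congested switch instead of dropping

class Sender:
    def __init__(self, rate_mbps: float):
        self.rate_mbps = rate_mbps

    def on_ack(self, ack_ecn_marked: bool) -> None:
        # The transmitting end reacts to an ECN-echoed ACK by reducing
        # its data transmission rate (here, halving it).
        if ack_ecn_marked:
            self.rate_mbps /= 2

def switch_forward(pkt: Packet, congested: bool) -> Packet:
    # A congested interconnect device marks the packet's ECN bit.
    if congested:
        pkt.ecn_marked = True
    return pkt

def receiver_ack(pkt: Packet) -> bool:
    # The receiver echoes the congestion indication back in its ACK.
    return pkt.ecn_marked

sender = Sender(rate_mbps=100.0)
pkt = switch_forward(Packet(b"data"), congested=True)
sender.on_ack(receiver_ack(pkt))
print(sender.rate_mbps)  # 50.0: rate reduced only after the receiver's feedback
```

Note that the sender reacts only after the receiver's ACK has made the round trip; this is precisely the feedback delay criticized in the next paragraph.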
The above technique depends on congestion feedback from the data receiving end to the data transmitting end; once the receiver's feedback is delayed, the transmitter cannot respond in time, degrading the real-time performance of network congestion processing.
To solve the above problems, the present disclosure first proposes a new data center GPU DIRECT RDMA network architecture. Referring to fig. 2 in combination, a schematic diagram of a server under a data center GPU DIRECT RDMA network architecture is provided according to one embodiment of the present disclosure. Fig. 2 is substantially similar to fig. 1; the main difference is that the server of the present disclosure includes a data transmission channel, a baseboard management controller, and a plurality of accelerator cards connected to the network switch and the data transmission channel.
In this embodiment, switch controller 0 in each server serves as the data transmission channel. Each accelerator card includes a data routing configuration: for any accelerator card, when its data routing configuration is a first data routing configuration pointing to the local card, the accelerator card sends data to the network switch through the local network interface card; when its data routing configuration is a second data routing configuration pointing to another, target accelerator card, the accelerator card sends data to the target accelerator card through the data transmission channel, and the data is sent to the network switch through the target accelerator card. For example, if the data routing configuration in accelerator card 1 is the first data routing configuration, accelerator card 1 transmits the data from graphics processor 1 to the network switch through local network interface card 1. As another example, if the data routing configuration in accelerator card 2 is a second data routing configuration pointing to accelerator card 1, accelerator card 2 sends the data from graphics processor 2 to accelerator card 1 through switch controller 0 (i.e., the data transmission channel), and the data is sent to the network switch through accelerator card 1.
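As a sketch, the two routing configurations described above can be modeled as a single per-card field. `LOCAL`, `AcceleratorCard`, and the path strings are illustrative assumptions for this sketch, not the patent's actual register layout.

```python
# Sketch of the per-accelerator-card data routing configuration.
# LOCAL models the first (point-to-local) configuration; a target card id
# models the second configuration. All names are illustrative assumptions.
LOCAL = "local"

class AcceleratorCard:
    def __init__(self, card_id: int):
        self.card_id = card_id
        self.routing = LOCAL  # initial configuration is the first one

    def egress_path(self) -> str:
        if self.routing == LOCAL:
            # First configuration: send through the local network interface card.
            return f"card{self.card_id} -> local NIC -> switch"
        # Second configuration: relay through the data transmission channel
        # to the target card, which forwards to the network switch.
        return f"card{self.card_id} -> channel -> card{self.routing} -> NIC -> switch"

card1, card2 = AcceleratorCard(1), AcceleratorCard(2)
card2.routing = 1  # the BMC repoints card 2's traffic through card 1
print(card1.egress_path())  # card1 -> local NIC -> switch
print(card2.egress_path())  # card2 -> channel -> card1 -> NIC -> switch
```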
The baseboard management controller may be connected to the network switch and to each accelerator card, and may modify the data routing configuration in an accelerator card when a specified condition is met; the initial data routing configuration in each accelerator card is the first data routing configuration. Specifically, the data routing configuration of an accelerator card is stored in that card's switch controller, and the baseboard management controller connects to the switch controller in the accelerator card to modify the configuration when the specified condition is satisfied.
Based on the server shown in fig. 2, the present disclosure first provides a communication control method, which can improve the real-time performance of network congestion processing. The communication control method can be applied to the baseboard management controller in the server shown in fig. 2. Referring to fig. 3 in combination, a flow chart of a communication control method according to an embodiment of the disclosure is provided. In fig. 3, the communication control method includes the steps of:
Step S301: receiving a first congestion detection result sent by the network switch.
The first congestion detection result may be the congestion detection result produced by the network switch at a certain moment. A congestion detection result characterizes whether data congestion has occurred in the network interface card n of each accelerator card n, and in practice it may take various forms. For example, in some embodiments, the network switch may directly return the accelerator card identifiers of the cards experiencing data congestion; the baseboard management controller can then determine which cards are congested and which are not by comparing the identifiers of the cards in the server with those returned by the switch. As another example, in other embodiments, the network switch may return the amount of data queued in the network interface card n of each accelerator card n, and the baseboard management controller can judge from the received data amounts whether each card is congested. The present disclosure does not limit the specific form of the congestion detection result.
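The two reported forms can be sketched as follows; the threshold value and all function names are assumptions for illustration only.

```python
# Sketch of the two congestion-detection-result forms described above.
# The threshold and all names are illustrative assumptions.
CONGESTION_THRESHOLD = 1024  # queued bytes above which a NIC counts as congested

def classify_by_ids(all_cards: set[int], reported_congested: set[int]):
    """Form 1: the switch returns the identifiers of congested cards directly."""
    return reported_congested, all_cards - reported_congested

def classify_by_queue_depth(depths: dict[int, int]):
    """Form 2: the switch returns per-card queued data amounts, and the
    baseboard management controller applies a threshold to classify them."""
    congested = {c for c, d in depths.items() if d > CONGESTION_THRESHOLD}
    return congested, set(depths) - congested

print(classify_by_ids({1, 2, 3}, {2}))                   # ({2}, {1, 3})
print(classify_by_queue_depth({1: 64, 2: 4096, 3: 16}))  # ({2}, {1, 3})
```

Either form lets the baseboard management controller partition the cards into congested (first) and non-congested (second) accelerator cards, which is all the later steps need.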
The network switch may send a congestion detection result to the baseboard management controller once every preset time period. Because of the dynamic variability of the network, the congestion detection results at different times may not be exactly the same. The first congestion detection result may be the congestion detection result that the network switch sends to the baseboard management controller at any given time.
The network switch may detect data congestion problems in the network interface card n of each accelerator card n using conventional methods known to those skilled in the art. These conventional methods may include, but are not limited to, queue management methods, traffic monitoring methods, port status monitoring methods, buffer management methods, and ECN-based control methods.
Step S302: if it is determined from the first congestion detection result that a first accelerator card with data congestion exists in the server, searching for a second accelerator card without data congestion.
As described in step S301, it may be determined from the first congestion detection result whether each accelerator card in the server is experiencing data congestion. When some accelerator cards are congested, the congested cards can be identified as first accelerator cards and the non-congested cards as second accelerator cards.
Step S303: modifying the first data routing configuration in the first accelerator card, which points to the local card, into a second data routing configuration pointing to the second accelerator card, so that the first accelerator card sends the data destined for the network switch to the second accelerator card based on the second data routing configuration, and the data is sent to the network switch through the second accelerator card.
Specifically, the data to be sent to the network switch in the first accelerator card may include first data that has been newly generated by the graphics processor in the first accelerator card but has not yet been route-allocated, and second data that has already been route-allocated in the first accelerator card but has not yet been sent to the network switch because of the congestion. The first data may be sent directly to the second accelerator card according to the modified second data routing configuration. If the second data also needs to be sent to the second accelerator card, route allocation may be performed on the second data again according to the modified second data routing configuration, so that the second data is sent to the second accelerator card.
Alternatively, in this embodiment, the first data may be sent to the network switch through the second accelerator card while the second data continues to be sent to the network switch through the first accelerator card. Thus, on the one hand, transferring the first data to the second accelerator card greatly alleviates the data congestion in the first accelerator card. On the other hand, continuing to send the second data through the first accelerator card makes full use of the network resources in the first accelerator card, reducing network resource waste, and also avoids the extra processing overhead of re-routing the second data.
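The two routing behaviors described above can be sketched as a minimal simulation. All class and function names here (`Card`, `drain_congested_card`, and so on) are illustrative assumptions, not interfaces defined by the disclosure:

```python
# Minimal simulation of the embodiment above: "first data" (new GPU output,
# not yet routed) is forwarded to the second card, while "second data"
# (already routed locally but stalled) keeps draining through the first card.

class Card:
    def __init__(self, name):
        self.name = name
        self.delivered = []            # packets this card sent to the switch

    def send_to_switch(self, packet):
        self.delivered.append(packet)

def drain_congested_card(first_card, second_card, first_data, second_data):
    for pkt in first_data:             # reroute under the new configuration
        second_card.send_to_switch(pkt)
    for pkt in second_data:            # keep local: reuses the first card's
        first_card.send_to_switch(pkt) # network resources, no re-routing cost
```

Keeping the second data on the first card trades a slower drain of the stalled queue for avoiding the re-routing work, matching the trade-off discussed above.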
Of course, if it is determined according to the first congestion detection result that no accelerator card in the server has data congestion, the data routing configuration in each accelerator card may be kept as the initial first data routing configuration.
In summary, in the technical solutions of some embodiments of the present disclosure, after a first accelerator card with data congestion in the server is identified based on the first congestion detection result sent by the network switch, the data routing configuration in the first accelerator card is modified so that data to be sent to the network switch is forwarded to a second accelerator card without data congestion and sent to the network switch through it. In the first aspect, transferring the data in this way relieves the data congestion in the first accelerator card. In the second aspect, because the congestion detection result comes from the network switch, congestion relief can start at the source of data sending without waiting for feedback from the data receiver, avoiding the untimely congestion handling caused by receiver feedback delay; that is, the scheme of the present disclosure improves the timeliness of network congestion handling. In the third aspect, conventional schemes relieve network congestion by reducing the data transmission rate, which inevitably degrades the data transmission performance of the data center; in the technical scheme of the present disclosure, the data is instead transferred to a congestion-free second accelerator card for transmission, so the data transmission rate can remain unchanged or even increase, which helps improve the data transmission performance of the data center.
The communication control method of the present disclosure is further described below.
In some embodiments, in the case where there are a plurality of second accelerator cards, modifying the first data routing configuration of the first accelerator card that points to the local to the second data routing configuration that points to the second accelerator card in step S303 may include:
Modifying the first data routing configuration to a second data routing configuration directed to at least two second accelerator cards, such that the first accelerator card divides data to be sent to the network switch into a plurality of subsets based on the second data routing configuration, and sends data in different subsets to the network switch in parallel through the at least two directed second accelerator cards.
For example, assuming that there are three second accelerator cards A1, A2, A3 in which no data congestion occurs, a second data routing configuration directed to each of them may be configured. The data may then be divided into three subsets a1, a2, a3: the data in subset a1 is sent to the second accelerator card A1, the data in subset a2 to the second accelerator card A2, and the data in subset a3 to the second accelerator card A3, each according to the corresponding second data routing configuration. In this way, the data can be transmitted in parallel through multiple second accelerator cards without data congestion, greatly improving the data transmission rate. Meanwhile, the network resources in each second accelerator card are fully utilized, avoiding network resource waste.
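One way to divide the pending data into subsets, one per congestion-free second accelerator card, is a round-robin split. The partitioning rule is an assumption for illustration; the disclosure only requires that the subsets be sent in parallel through the pointed-to cards:

```python
# Round-robin partition of pending packets into one subset per second
# accelerator card; packet i goes to subset i mod num_cards.

def split_into_subsets(data, num_cards):
    subsets = [[] for _ in range(num_cards)]
    for i, packet in enumerate(data):
        subsets[i % num_cards].append(packet)
    return subsets
```

A round-robin split keeps the subsets balanced in size without inspecting packet contents; a size- or priority-aware split would also satisfy the scheme.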
In some embodiments, after modifying the first data routing configuration in the first accelerator card to the second data routing configuration, the method of the present disclosure may further comprise:
receiving a second congestion detection result sent by the network switch;
If it is determined that the data congestion problem of the first accelerator card is solved according to the second congestion detection result, the second data routing configuration in the first accelerator card can be modified into the first data routing configuration, so that the first accelerator card sends data to the network switch through the local network interface card based on the first data routing configuration.
Thus, on one hand, the data in the first accelerator card can be prevented from occupying network resources in the second accelerator card for a long time. On the other hand, the data can be directly sent to the network switch through the network interface card in the first accelerator card, so that the time consumed when the data is sent to the second accelerator card can be reduced, and the data transmission rate is improved.
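The reaction to the second congestion detection result can be sketched as follows. Modeling the routing configuration as a dictionary from card identifier to target is an assumption made purely for illustration:

```python
# Restore the first data routing configuration once congestion has cleared.
# routing maps a card id to its current target: LOCAL (the card's own NIC)
# or the id of a second card it forwards to.

LOCAL = "local"

def restore_routing(routing, congested):
    """congested: set of card ids still reported congested by the switch."""
    for card_id, target in list(routing.items()):
        if target != LOCAL and card_id not in congested:
            routing[card_id] = LOCAL   # congestion resolved: use the local NIC again
    return routing
```

Reverting promptly frees the second card's network resources and removes the extra hop, as the paragraph above notes.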
In some embodiments, after receiving the second congestion detection result, the method of the present disclosure further includes:
If the first acceleration card and the second acceleration card are determined to have data congestion according to the second congestion detection result, searching a third acceleration card without data congestion in the server;
And modifying the second data routing configuration in the first accelerator card and the third data routing configuration in the second accelerator card, which points to the local, into a fourth data routing configuration, which points to the third accelerator card, so that the first accelerator card and the second accelerator card send data to the network switch through the third accelerator card based on the fourth data routing configuration.
Specifically, data congestion in the second accelerator card may be caused by excessive transmission pressure after data is distributed into it from the first accelerator card, or by the amount of data output by the graphics processor in the second accelerator card itself. Whatever the cause, once data congestion occurs in the second accelerator card, it is no longer suitable to send data from the first accelerator card into it, so the second data routing configuration in the first accelerator card can be modified to a fourth data routing configuration directed to the third accelerator card. In this way, the data in the first accelerator card is sent to the third accelerator card, avoiding any aggravation of the data congestion in the second accelerator card. Of course, it will be appreciated that, to further alleviate the congestion in the second accelerator card, the third data routing configuration directed to the local in the second accelerator card may also be modified to a fourth data routing configuration directed to the third accelerator card. In this way, the second accelerator card can send at least part of its local data to the third accelerator card for transmission, further relieving its own data congestion.
Further, considering that the local data of the second accelerator card may include both the data acquired from the first accelerator card that has not yet been transmitted and the data output by the local graphics processor, and that the data from the first accelerator card has already been delayed for a long time, the second accelerator card may first send the data acquired from the first accelerator card to the third accelerator card, and only then send the data output by the local graphics processor. In this way, the data from the first accelerator card is prevented from being delayed even longer.
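The transmission order suggested above can be sketched in one line: data taken over from the first card, which has already waited through one transfer, is drained before the second card's own GPU output. The function name and the list-based queues are assumptions:

```python
# Order in which the second card off-loads to the third card: packets
# forwarded from the first card (oldest, already delayed) go first, then
# the second card's own GPU output.

def offload_order(from_first_card, local_gpu_output):
    return list(from_first_card) + list(local_gpu_output)
```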
In some embodiments, in the case where there are a plurality of third accelerator cards, modifying the second data routing configuration in the first accelerator card and the third data routing configuration in the second accelerator card that points to the local to the fourth data routing configuration that points to the third accelerator card may include:
Modifying the second data routing configuration in the first accelerator card to a fourth data routing configuration directed to a target third accelerator card;
modifying the third data routing configuration in the second accelerator card to a fourth data routing configuration directed to a non-target third accelerator card;
Wherein the target third accelerator card and the non-target third accelerator card are different third accelerator cards.
For example, assume that third accelerator cards B1, B2, B3, B4 are present. Then, the data routing configuration in the first accelerator card may be modified to point to the third accelerator card B1, B2, and the data routing configuration in the second accelerator card may be modified to point to the third accelerator card B3, B4. Therefore, the problem that the data transmission pressure of the corresponding third acceleration card is overlarge due to the fact that the first acceleration card and the second acceleration card send data to the same third acceleration card can be avoided.
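The B1..B4 example above amounts to giving the first and second cards disjoint sets of third cards. An even split is an assumption for illustration; the disclosure only requires that the target and non-target third cards differ:

```python
# Split the available third cards into two disjoint groups: one for the
# first card's traffic, one for the second card's traffic.

def assign_third_cards(third_cards):
    mid = len(third_cards) // 2
    targets = third_cards[:mid]        # for the first accelerator card
    non_targets = third_cards[mid:]    # for the second accelerator card
    return targets, non_targets
```

Because the groups share no card, the two congested cards can never pile their traffic onto the same third card.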
In some embodiments, the first accelerator card includes a routing register for storing an accelerator card identification of the accelerator card currently pointed to in the data routing configuration of the first accelerator card;
The modifying the first data routing configuration of the first accelerator card, which points to the local, to the second data routing configuration pointing to the second accelerator card may include:
And modifying the accelerator card identification of the first accelerator card into the accelerator card identification of the second accelerator card in the routing register of the first accelerator card.
In particular, the routing registers may be located in the switch controller of the first accelerator card, and there may be one or more of them. If there are multiple routing registers and a second data routing configuration pointing to multiple second accelerator cards needs to be configured, the second accelerator cards may correspond one-to-one with the routing registers, each register holding the accelerator card identifier of one second accelerator card. If there is only one routing register, it may be a multi-bit register (e.g., a 16-bit routing register), so that the accelerator card identifiers of multiple second accelerator cards can be configured in the same register.
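Packing several identifiers into one multi-bit register can be sketched as follows. The 4-bit field width (four identifiers in a 16-bit register) is an assumption; the disclosure only states that one multi-bit register may hold multiple identifiers:

```python
# Pack/unpack accelerator-card identifiers into fixed-width fields of a
# single routing register value (4 bits per identifier assumed here).

ID_BITS = 4                            # bits per accelerator-card identifier

def pack_ids(ids):
    reg = 0
    for i, card_id in enumerate(ids):
        assert 0 <= card_id < (1 << ID_BITS)
        reg |= card_id << (i * ID_BITS)
    return reg

def unpack_ids(reg, count):
    mask = (1 << ID_BITS) - 1
    return [(reg >> (i * ID_BITS)) & mask for i in range(count)]
```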
To this end, the related description of the communication control method is completed.
Corresponding to the communication control method, the present disclosure also provides a communication method. The communication method can be applied to the first accelerator card in the server shown in fig. 2. Referring to fig. 4 in combination, a flow chart of a communication method according to an embodiment of the disclosure is provided. In fig. 4, the communication method includes the steps of:
Step S401, when generating target data to be sent to a network switch, obtaining a local data routing configuration.
Step S402, if the data routing configuration is the second data routing configuration pointing to the second accelerator card, it is determined that data congestion occurs locally, and the target data is sent to the second accelerator card, so that the target data is sent to the network switch through the second accelerator card.
For the related principles of the communication method, reference may be made to the related description of the communication control method, which is not repeated herein.
In the technical solutions of some embodiments of the present disclosure, after a first accelerator card with data congestion in the server is identified based on the first congestion detection result sent by the network switch, the data routing configuration in the first accelerator card is modified so that target data to be sent to the network switch is forwarded to a second accelerator card without data congestion and sent to the network switch through it. In the first aspect, transferring the target data in this way relieves the data congestion in the first accelerator card. In the second aspect, because the congestion detection result comes from the network switch, congestion relief can start at the source of data sending without waiting for feedback from the data receiver, avoiding the untimely congestion handling caused by receiver feedback delay; that is, the scheme of the present disclosure improves the timeliness of network congestion handling. In the third aspect, conventional schemes relieve network congestion by reducing the data transmission rate, which inevitably degrades the data transmission performance of the data center; in the technical scheme of the present disclosure, the target data is instead transferred to a congestion-free second accelerator card for transmission, so the data transmission rate can remain unchanged or even increase, which helps improve the data transmission performance of the data center.
In some embodiments, if the second data routing configuration is directed to at least two second accelerator cards, the method of the present disclosure may further comprise:
Dividing the target data into a plurality of subsets;
And transmitting the target data in different subsets to the network switch in parallel through the at least two pointed second accelerator cards.
Specifically, the technical solution in this embodiment may refer to the related description of the communication control method, which is not repeated herein.
In some embodiments, the sending the target data to the second accelerator card in step S402 may include:
Determining the sending priority of the target data;
and marking the priority of the target data according to the determined transmission priority, so that the second accelerator card determines the transmission time of the target data based on the marked transmission priority after receiving the target data.
Specifically, the transmission priority may characterize the transmission urgency of the target data and may be proportional to that urgency. After receiving the target data, the second accelerator card may determine the transmission order between the target data and other data based on the target data's transmission priority. For example, when the transmission priority of the target data is relatively high, the second accelerator card may send the target data to the network switch immediately after receiving it, which ensures real-time transmission of the target data and avoids excessive delay. Conversely, when the transmission priority of the target data is relatively low, the second accelerator card may queue the target data behind other data and send it only after the other data has been transmitted. In this way, data with higher transmission urgency is guaranteed to be sent first.
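A priority-ordered send queue in the second card might look like the following sketch. The heap-based queue is an assumption about implementation; the disclosure only requires that higher-priority data be sent first:

```python
# Send queue that pops the highest-priority packet first, preserving FIFO
# order among packets of equal priority.

import heapq

class SendQueue:
    def __init__(self):
        self._heap = []
        self._seq = 0                  # tie-breaker for equal priorities

    def push(self, packet, priority=0):
        # heapq is a min-heap, so negate the priority to pop urgent data first
        heapq.heappush(self._heap, (-priority, self._seq, packet))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]
```

The sequence counter matters: without it, equally urgent packets could be reordered arbitrarily, defeating the delay guarantees discussed above.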
Further, if the first accelerator card does not mark a sending priority on the target data, the second accelerator card may, after receiving the target data, send it only after the data output by its own local graphics processor has been sent. In other words, the sending priority of the second accelerator card's local data may default to being higher than that of the data from the first accelerator card.
In practical application, a proper scheme can be selected according to practical requirements to determine the data transmission priority in the first accelerator card and the second accelerator card.
In some embodiments, the methods of the present disclosure may further comprise:
if the data routing configuration in step S401 is the first data routing configuration pointing to the local, it may be determined that no problem of data congestion occurs locally, and the target data is sent to the network switch through the local network interface card.
In some embodiments, the first accelerator card includes a routing register for storing an accelerator card identification of the accelerator card currently pointed to in the data routing configuration of the first accelerator card;
the obtaining the local data routing configuration may include:
acquiring a currently stored target acceleration card identifier from a routing register of a first acceleration card;
if the target acceleration card identifier is the acceleration card identifier of the second acceleration card, determining that the data routing configuration is a second data routing configuration pointing to the second acceleration card;
and if the target accelerator card identifier is the accelerator card identifier of the first accelerator card, determining that the data routing configuration is a first data routing configuration pointing to the local.
The specific principles of the above embodiments may be referred to the related description of the communication control method, and are not repeated herein.
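Steps S401 and S402 can be condensed into a small sketch. The register is modeled as a plain value and the NIC and peer cards as lists; all names are assumptions for illustration:

```python
# The first card reads its routing register and either uses the local NIC
# (first data routing configuration) or forwards to a second card (second
# data routing configuration).

def route_target_data(my_id, register_value, packet, local_nic, peer_cards):
    if register_value == my_id:            # register points at this card
        local_nic.append(packet)           # no local congestion: local NIC
        return "local"
    peer_cards[register_value].append(packet)  # congestion: forward to peer
    return register_value
```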
In some embodiments, upon sending the target data to the second accelerator card, the method of the present disclosure may further comprise:
and if there is locally congested data that was generated before the target data and needs to be sent to the network switch, sending at least part of that congested data to the second accelerator card, so that it is sent to the network switch through the second accelerator card.
In this way, the data transmission pressure in the first accelerator card is further reduced.
Corresponding to the communication control method, the present disclosure also provides a communication control device. Referring to fig. 5 in combination, a schematic block diagram of a communication control device according to an embodiment of the disclosure is provided. In fig. 5, the communication control apparatus includes:
a detection result receiving module 501, configured to receive a first congestion detection result sent by a network switch;
The acceleration card state identifying module 502 is configured to, if it is determined that a first acceleration card with data congestion exists in the server according to the first congestion detection result, find a second acceleration card with no data congestion;
The routing configuration modification module 503 is configured to modify a first data routing configuration of the first accelerator card, which points to the local, to a second data routing configuration pointing to the second accelerator card, so that the first accelerator card sends data to be sent to the network switch to the second accelerator card based on the second data routing configuration, and sends the data to the network switch through the second accelerator card.
In some embodiments, in the case where there are a plurality of second accelerator cards, the routing configuration modification module 503 is specifically configured to:
Modifying the first data routing configuration to a second data routing configuration directed to at least two second accelerator cards, such that the first accelerator card divides data to be sent to the network switch into a plurality of subsets based on the second data routing configuration, and sends data in different subsets to the network switch in parallel through the at least two directed second accelerator cards.
In some embodiments, after modifying the first data routing configuration in the first accelerator card to the second data routing configuration, the routing configuration modification module 503 is further configured to:
receiving a second congestion detection result sent by the network switch;
And if the data congestion problem of the first acceleration card is solved according to the second congestion detection result, modifying the second data routing configuration in the first acceleration card into the first data routing configuration so that the first acceleration card sends data to the network switch through the local network interface card based on the first data routing configuration.
In some embodiments, after receiving the second congestion detection result, the detection result receiving module 501 is further configured to:
If the first acceleration card and the second acceleration card are determined to have data congestion according to the second congestion detection result, searching a third acceleration card without data congestion in the server;
And modifying the second data routing configuration in the first accelerator card and the third data routing configuration in the second accelerator card, which points to the local, into a fourth data routing configuration, which points to the third accelerator card, so that the first accelerator card and the second accelerator card send data to the network switch through the third accelerator card based on the fourth data routing configuration.
In some embodiments, in the case where there are a plurality of third accelerator cards, the routing configuration modification module 503 is specifically configured to:
Modifying the second data routing configuration in the first accelerator card to a fourth data routing configuration directed to a target third accelerator card;
modifying the third data routing configuration in the second accelerator card to a fourth data routing configuration directed to a non-target third accelerator card;
Wherein the target third accelerator card and the non-target third accelerator card are different third accelerator cards.
In some embodiments, if it is determined according to the first congestion detection result that no accelerator card in the server has data congestion, the routing configuration modification module 503 is specifically configured to:
the data routing configuration in each accelerator card is maintained as the initial first data routing configuration.
Corresponding to the communication method, the disclosure also provides a communication device. Referring to fig. 6 in combination, a schematic block diagram of a communication device according to an embodiment of the disclosure is provided. In fig. 6, the communication apparatus includes:
A route configuration obtaining module 601, configured to obtain a local data route configuration when generating target data to be sent to a network switch;
the data sending module 602 is configured to determine that a problem of data congestion occurs locally if the data routing configuration is a second data routing configuration pointing to a second accelerator card, and send the target data to the second accelerator card, so as to send the target data to the network switch through the second accelerator card.
In some embodiments, if the second data routing configuration is directed to at least two second accelerator cards, the data transmission module 602 is further configured to:
Dividing the target data into a plurality of subsets;
And transmitting the target data in different subsets to the network switch in parallel through the at least two pointed second accelerator cards.
In some embodiments, the data transmission module 602 is specifically configured to:
Determining the sending priority of the target data;
and marking the priority of the target data according to the determined transmission priority, so that the second accelerator card determines the transmission time of the target data based on the marked transmission priority after receiving the target data.
In some embodiments, the data transmission module 602 is further configured to:
if the data route configuration is the first data route configuration pointing to the local, determining that the problem of data congestion does not occur locally, and sending target data to the network switch through the local network interface card.
In some embodiments, when transmitting the target data to the second accelerator card, the data transmission module 602 is further configured to:
and if there is locally congested data that was generated before the target data and needs to be sent to the network switch, send at least part of that congested data to the second accelerator card, so that it is sent to the network switch through the second accelerator card.
The communication control device in this embodiment is presented in the form of functional units, where a unit may be an ASIC (Application-Specific Integrated Circuit), a processor and a memory that execute one or more software or firmware programs, and/or other devices that can provide the above functions.
The communication control device of the present disclosure has the same advantageous effects as the above-described communication control method and communication method, and is not described here in detail.
Referring to fig. 7 in combination, a schematic structural diagram of an electronic device according to some embodiments of the present disclosure is provided. As shown in fig. 7, the electronic device includes one or more processors 10, a memory 20, and interfaces for connecting components, including a high-speed interface and a low-speed interface. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 7.
The processor 10 may be a first PCIe device, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
The memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform the methods of the above embodiments.
The memory 20 may include a storage program area that may store an operating system, application programs required for at least one function, and a storage data area that may store data created according to the use of the electronic device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, memory 20 may optionally include memory located remotely from processor 10, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The memory 20 may comprise volatile memory, such as random access memory, or nonvolatile memory, such as flash memory, hard disk or solid state disk, or the memory 20 may comprise a combination of the above types of memory.
The electronic device also includes a communication interface 30 for the electronic device to communicate with other devices or communication networks.
Embodiments of the present disclosure also provide a computer-readable storage medium. The methods described above may be implemented in hardware or firmware, or as computer code recorded on a storage medium, or as computer code originally stored in a remote storage medium or a non-transitory machine-readable storage medium, downloaded over a network, and stored in a local storage medium, so that the methods described herein may be executed from such a storage medium by a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random-access memory, a flash memory, a hard disk, a solid-state disk, or the like, or a combination of the above types of memory. It will be appreciated that a computer, processor, microprocessor controller, or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the methods of the above embodiments.
Portions of the present disclosure may be implemented as a computer program product, such as computer program instructions, which, when executed by a computer, may invoke or provide methods and/or technical solutions in accordance with the present disclosure through the operation of the computer. Those skilled in the art will appreciate that computer program instructions in a computer-readable medium exist in forms including, but not limited to, source files, executable files, installation package files, and the like; accordingly, the manner in which a computer executes them includes, but is not limited to: directly executing the instructions; compiling the instructions and then executing the corresponding compiled program; reading and executing the instructions; or reading and installing the instructions and then executing the corresponding installed program. Herein, a computer-readable medium may be any available computer-readable storage medium or communication medium that can be accessed by a computer.
Although embodiments of the present disclosure have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the disclosure, and such modifications and variations are within the scope defined by the appended claims.
Claims (16)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411735904.9A CN119232657B (en) | 2024-11-29 | 2024-11-29 | Server, communication control method, communication method, device, medium, and product |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN119232657A CN119232657A (en) | 2024-12-31 |
| CN119232657B true CN119232657B (en) | 2025-02-28 |
Family
ID=93943547
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411735904.9A Active CN119232657B (en) | 2024-11-29 | 2024-11-29 | Server, communication control method, communication method, device, medium, and product |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119232657B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119906687B (en) * | 2025-03-31 | 2025-06-20 | 浪潮电子信息产业股份有限公司 | Server and equipment monitoring system and method thereof |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112653634A (en) * | 2020-12-10 | 2021-04-13 | 苏州浪潮智能科技有限公司 | Flow control method, device, equipment and readable storage medium |
| CN118381763A (en) * | 2024-05-14 | 2024-07-23 | 新华三技术有限公司 | Congestion control method and device |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11394649B2 (en) * | 2018-06-29 | 2022-07-19 | Intel Corporation | Non-random flowlet-based routing |
| US10795840B2 (en) * | 2018-11-12 | 2020-10-06 | At&T Intellectual Property I, L.P. | Persistent kernel for graphics processing unit direct memory access network packet processing |
| US20220124035A1 (en) * | 2021-05-05 | 2022-04-21 | Intel Corporation | Switch-originated congestion messages |
| CN117176666A (en) * | 2023-09-27 | 2023-12-05 | 苏州元脑智能科技有限公司 | Network flow control method, device, switch, electronic equipment and storage medium |
| CN118939391A (en) * | 2024-07-10 | 2024-11-12 | 武汉元石智算科技有限公司 | Automatic model parallel scheduling strategy generation method and device based on heterogeneous computing power |
- 2024-11-29: CN CN202411735904.9A, patent CN119232657B (en), status: Active
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112653634A (en) * | 2020-12-10 | 2021-04-13 | 苏州浪潮智能科技有限公司 | Flow control method, device, equipment and readable storage medium |
| CN118381763A (en) * | 2024-05-14 | 2024-07-23 | 新华三技术有限公司 | Congestion control method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119232657A (en) | 2024-12-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11575609B2 (en) | Techniques for congestion management in a network | |
| US12119958B2 (en) | Cross network bridging | |
| US20220224614A1 (en) | Technologies for capturing processing resource metrics as a function of time | |
| US20180210752A1 (en) | Accelerator virtualization method and apparatus, and centralized resource manager | |
| US9866479B2 (en) | Technologies for concurrency of cuckoo hashing flow lookup | |
| EP3291089B1 (en) | Data processing method and apparatus | |
| CN119232657B (en) | Server, communication control method, communication method, device, medium, and product | |
| US20120226733A1 (en) | Method for distributing and controlling traffic in cloud computing system and cloud computing system using the same | |
| US11799827B2 (en) | Intelligently routing a response packet along a same connection as a request packet | |
| US20120144063A1 (en) | Technique for managing traffic at a router | |
| CN115190062B (en) | Service processing method and device, electronic equipment and computer readable storage medium | |
| CN113347017B (en) | Network communication method and device, network node equipment and hybrid network | |
| CN104252416A (en) | Accelerator and data processing method | |
| CN118433110A (en) | Data processing method, device, equipment and storage medium | |
| US20200329090A1 (en) | Method and network node for handling sctp packets | |
| CN119201416A (en) | Multi-job distributed training system and method | |
| CN114490458B (en) | Data transmission method, chip, server and storage medium | |
| US11824752B2 (en) | Port-to-port network routing using a storage device | |
| CN111240845B (en) | Data processing method, device and storage medium | |
| CN111726372B (en) | Thermal migration method, device, equipment and storage medium | |
| CN116802620A (en) | Apparatus and method for remote direct memory access | |
| US20250126189A1 (en) | Packet load balancer | |
| KR20190064290A (en) | Method and Apparatus for acceleration of data sending and receiving based on network interface card | |
| US11849005B2 (en) | Method and apparatus for accelerating network transmission in memory-disaggregated environment | |
| CN110166373B (en) | Method, device, medium and system for sending data from source physical machine to destination physical machine |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication ||
| SE01 | Entry into force of request for substantive examination ||
| GR01 | Patent grant ||