
CN119174155A - Adaptive Traffic Arbitration Engine - Google Patents


Info

Publication number
CN119174155A
CN119174155A (application CN202280095939.3A)
Authority
CN
China
Prior art keywords
frame
bits
hardware device
instruction
arbitration
Prior art date
Legal status
Pending
Application number
CN202280095939.3A
Other languages
Chinese (zh)
Inventor
Francisco Fons Lluis
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN119174155A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/30: Peripheral units, e.g. input or output ports
    • H04L 49/3018: Input queuing
    • H04L 49/50: Overload detection or protection within a single switching element


Abstract

The present invention relates to frame arbitration in a network entity, e.g. for data forwarding, switching, routing or gateways. A hardware device for the frame arbitration, a frame having a frame format suitable for the frame arbitration, and a method for the frame arbitration are presented. The hardware device includes N > 1 ingress ports for receiving N frames, M > 1 egress ports for outputting M frames, and processing circuitry coupled to the ingress ports and the egress ports. Each frame includes a frame header including a frame priority field, which includes a set of bits comprising two or more subsets of bits, each subset of bits indicating arbitration metadata. When a first frame and a second frame are received simultaneously, the processing circuitry determines, based on the frame priority fields, which of the two frames is to be processed and output at the egress ports in a next clock cycle.

Description

Adaptive traffic arbitration engine
Technical Field
The present invention relates to network traffic arbitration in a network entity, wherein network traffic is handled in units of frames. The network entity may be used for data forwarding, switching, routing or gateways in a communication network. The invention proposes a hardware device for frame arbitration in a network entity. Furthermore, the invention proposes a frame having a frame format suitable for frame arbitration and a method for performing frame arbitration.
Background
Conventional switching or gateway schemes that can be found in network entities such as gateways used in the automotive industry are typically based on software-centric implementations. In such a software-centric approach, the frame arbitration policy for forwarding frames from the plurality of ingress ports to the plurality of egress ports of the network entity is determined by a routing algorithm that may be executed by a core processing unit (core processing unit, CPU) of, for example, a microcontroller unit (microcontroller unit, MCU) or a system on chip (SoC) device.
Arbitration policies can become complex in terms of timing requirements, particularly in network entities having a significant number of high-speed ingress and egress ports that must handle large amounts of network traffic, where at least a portion of the network traffic is critical in terms of functional safety and/or time deterministic requirements. This is the case, for example, for future 100Mbps, 1000Mbps, or even multi-gigabit automotive ethernet.
Software-centric approaches may lead to data throughput bottlenecks and design challenges that are difficult to overcome. Furthermore, software-centric approaches may not scale well due to their complexity.
Disclosure of Invention
In view of the above, the present invention aims to provide an improved scheme for arbitrating frames in a network entity. The goals to be achieved include that the frame arbitration scheme involves reduced complexity in terms of programming, is scalable, and provides similar or better performance than conventional schemes (e.g., the software-centric schemes described above).
These and other objects are achieved by the solutions described in the independent claims. Advantageous implementations are further described in the dependent claims.
A first aspect of the invention provides a hardware device for frame arbitration in a network entity, the hardware device comprising: N ingress ports for receiving N frames, wherein N > 1; M egress ports for outputting M frames, wherein M > 1; and processing circuitry connected to the ingress ports and the egress ports. Each frame comprises a frame header comprising a frame priority field, wherein the frame priority field comprises a set of bits, and wherein the set of bits comprises two or more subsets of bits, each subset of bits indicating respective arbitration metadata. The processing circuitry is configured to, when at least a first frame and a second frame are received simultaneously at the ingress ports, determine which of the first frame and the second frame is processed and output, sequentially or concurrently, at the egress ports in a next clock cycle, based on the set of bits in the frame priority field of the first frame and the second frame, respectively.
According to the first aspect, the present invention proposes a hardware-centric approach to frame arbitration in a network entity instead of a software-centric approach. The hardware-centric approach may reduce the level of complexity in programming. The scheme of the present invention also has better scalability and can achieve better performance levels (e.g., in terms of key performance indicators (key performance indicator, KPI) including frame delay and jitter or in terms of data throughput through the network entity) than conventional software-centric schemes. The hardware-centric approach of the present invention may automate all routing and forwarding processes that are allocated to the network entities at different processing stages.
For example, the network entity may be an automotive gateway in a vehicle and the hardware device may be comprised in a gateway controller. But the network entity may also be a switch or a router or the like.
The hardware device of the first aspect may provide a fully autonomous frame arbitration scheme that operates exclusively in hardware, without any software intervention. It should be noted that, in the present invention, software is understood as machine instructions, described by source code, that are executed by a processor such as a CPU. In the present invention, the hardware device is responsible for arbitration of frames from the ingress ports to the egress ports without such machine instructions.
Each frame may require different types of processing tasks, and the hardware device may be capable of performing these different types of processing tasks in parallel on the frame. Frame arbitration may include assigning the frames received at the ingress ports to those processing tasks as needed. The processing tasks may be performed by one or more processing resources of the hardware device, such as by one or more hardware accelerators. Any frame may also be associated with two or more processing tasks that must be completed sequentially, e.g., according to a particular processing order. In this case, the frame may be processed by executing processing tasks one by one in a certain processing order.
For example, the frame arbitration may be performed by the hardware device if the first frame and the second frame require the same processing task or tasks, which must be performed by the same processing resource or resources. In this case, depending on the determination of the processing circuitry of the hardware device, one of the first frame and the second frame may proceed before the other frame, and the other frame may wait, for example in a queue, until the processing resource is idle again after processing the first-served frame. In this way, the frame arbitration takes effect, and the processing circuitry decides which of two frames competing for the same processing resource or resources has the higher priority.
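As an illustration only (a software sketch, not the patented hardware; the priority words and the larger-value-wins convention are assumptions), the contention between two simultaneously received frames for one shared processing resource can be modelled as:

```python
from collections import deque

def arbitrate(prio_a: int, prio_b: int) -> int:
    """Return 0 if the first frame wins, 1 if the second frame wins.
    Assumption: a numerically larger priority word means higher priority."""
    return 0 if prio_a >= prio_b else 1

# The losing frame waits in a queue until the resource is idle again.
wait_queue = deque()
frame_a, frame_b = 0b1011, 0b0110  # hypothetical priority words
if arbitrate(frame_a, frame_b) == 0:
    wait_queue.append(frame_b)
```

The comparison itself is combinational, so in hardware it resolves within the same clock cycle in which the frames arrive.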
In an implementation form of the first aspect, the processing circuitry is configured to update the set of bits in the frame priority field of each frame before the frames are processed and output at the output port.
Thus, for example, another hardware device as described in the first aspect, also designated for processing the one or more frames output by the hardware device of the first aspect, may use the one or more updated frame priority fields for further frame arbitration. For example, each of the plurality of processing stages of the network entity may comprise a hardware device according to the first aspect, and each hardware device may perform frame arbitration at its associated processing stage according to the frame priority field in the frames it receives at its ingress ports.
In an implementation of the first aspect, the set of bits in a frame priority field of the frame is updated based on information contained within the frame and/or based on available information of the hardware device itself and a network environment.
The available information may include at least one of a state of the network and a state of the hardware device.
In an implementation of the first aspect, the set of bits in the frame priority field of the frame is updated based on at least one of a status of a queue receiving the frame, a timeout and/or timestamp of the frame, a list of one or more network processing tasks to be performed on the frame, a status of one or more hardware accelerators, wherein one or more network processing tasks have to be performed on the frame before the frame is output, a status of a time sensitive network shaper of the frame to be output, an in-band telemetry status, a virtual local area network tag priority of the frame.
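A minimal software model of such an update might re-encode the priority word from a few of these inputs; the 12-bit layout below (queue fill level, saturating frame age, VLAN tag priority) is purely hypothetical and not taken from the patent:

```python
def update_priority_field(queue_fill: int, age_ticks: int, vlan_pcp: int) -> int:
    """Re-encode a frame priority word from current device/network state.
    Hypothetical layout: bits 11..8 queue fill level, bits 7..4 frame age
    (saturating at 15 so long-waiting frames cap out), bits 2..0 VLAN priority."""
    age = min(age_ticks, 0xF)
    return ((queue_fill & 0xF) << 8) | (age << 4) | (vlan_pcp & 0x7)
```

Placing the fastest-changing inputs (e.g., frame age) in dedicated bit positions lets the comparison remain a plain integer compare.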
In an implementation form of the first aspect, each frame comprises an instruction frame and a data frame, wherein the frame header comprising the frame priority field belongs to an instruction frame.
In an implementation manner of the first aspect, the hardware device is configured to receive and output each instruction frame on a control plane, and to receive and output each data frame on a data plane.
In an implementation form of the first aspect, the processing circuit is configured to separate each frame received at the ingress port into an instruction frame and a data frame, wherein the instruction frame is provided with the frame header comprising the frame priority field.
According to the above-described implementations describing the instruction frame and the data frame, respectively, the solution of the invention supports a software defined networking (SDN) approach, which implies a separation into a data plane and a control plane.
In an implementation form of the first aspect, the processing circuit is further configured to process the data frame of each frame according to an instruction included in the frame header and a payload of the instruction frame of the frame before outputting the frame at the output port.
The instructions included in the instruction frame may be processed by the hardware device to control processing of the data frame. For example, the processing circuitry of the hardware device may automatically process the data frame according to instruction bits of the instructions included in the instruction frame without any software interaction. Thus, the processing circuitry may also arbitrate processing of different data frames according to the set of bits in the respective frame priority information fields in the instruction frame associated with the different data frames.
In an implementation of the first aspect, the processing circuit is configured to associate a weight with the subset of bits in the frame priority field of each frame, and to determine which of the first frame and the second frame to output at the output port in the next clock cycle further based on the weights associated with the subset of bits in the frame priority fields of the first frame and the second frame, respectively, and by executing a sorting algorithm.
In this way, the prioritization of the frames may be set and/or changed by the hardware device.
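The weighting-and-sorting step can be sketched as a dot product of the metadata subsets with configurable weights, followed by a sort; the subset semantics and weight values here are assumptions for illustration:

```python
def weighted_key(subsets, weights):
    """Combine the subsets of bits of a frame priority field into one
    comparable arbitration key (higher key wins)."""
    return sum(s * w for s, w in zip(subsets, weights))

# Two frames described by hypothetical (priority, age, queue-fill) subsets.
weights = [8, 2, 1]  # configurable, e.g. via the multiplexer settings
frames = {"first": [2, 7, 1], "second": [3, 0, 0]}
order = sorted(frames, key=lambda f: weighted_key(frames[f], weights),
               reverse=True)
```

With these weights an old, queued frame of medium priority can overtake a fresh frame of nominally higher priority, which is the point of weighting several metadata subsets rather than a single priority value.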
In an implementation of the first aspect, the processing circuit comprises a set of configurable multiplexers, wherein different configurations of the set of configurable multiplexers correspond to different weights associated with the subset of bits in the frame priority field of each frame.
For example, based on the weighted subset of bits in the frame priority fields of the first and second frames, the multiplexer may automatically determine which of the first and second frames to process and output first at the egress port in the next clock cycle.
In an implementation of the first aspect, the N ingress ports are connected to N or more parallel queues for providing the N ingress frames in parallel from the N ingress ports, and/or the M egress ports are connected to M or more parallel queues for receiving the M egress frames in parallel and transmitting them in parallel to the M egress ports.
Frames that have to wait for processing due to the frame arbitration may remain in the queue. For example, the hardware device may receive parallel frames at an ingress port, may provide them to the parallel queue, and may select them one by one from the parallel queue for processing according to the frame priority field. For example, the hardware device may provide processed frames into a parallel queue, and may further output the frames from the queue at the egress port.
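The parallel ingress queues feeding the arbitration step, with one frame selected per clock cycle, can be modelled as follows (the tuple encoding of frames is an assumption):

```python
from collections import deque

def select_next(queues):
    """Dequeue the head frame with the highest priority word among the
    non-empty parallel ingress queues; frames are (priority, payload) tuples.
    Returns None when every queue is empty."""
    heads = [(q[0][0], i) for i, q in enumerate(queues) if q]
    if not heads:
        return None
    _, idx = max(heads)
    return queues[idx].popleft()

# Three parallel queues, one per ingress port; the third is empty.
queues = [deque([(5, "a")]), deque([(9, "b"), (1, "c")]), deque()]
```

Each call to `select_next` stands in for one clock cycle of the dequeuing stage.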
A second aspect of the present invention provides an instruction frame having a frame format suitable for frame arbitration in a network entity, the frame format comprising a frame header, wherein the frame header comprises a frame priority field, wherein the frame priority field comprises a set of bits, wherein the set of bits comprises two or more subsets of bits, and wherein each subset of bits is indicative of respective arbitration metadata.
The instruction frame of the second aspect, in particular the frame priority field, implements the hardware-centric approach of the present invention for the frame arbitration, as implemented by the hardware device of the first aspect. Thus, the frame format of the instruction frame yields the advantages described above for the hardware device.
In an implementation of the second aspect, the respective arbitration metadata includes one or more bits indicating a priority of the instruction frame.
Thus, in case the same processing resources have to be used for two or more frames, the processing circuitry of the hardware device of the first aspect may, for example, process the two or more frames one by one according to their priorities.
In an implementation of the second aspect, the respective arbitration metadata further comprises at least one of one or more bits indicating a queue status of a queue providing the frame, one or more bits indicating a timeout or timestamp of the frame, one or more bits indicating a list of one or more processing tasks to be performed on the frame, one or more bits indicating a status of one or more hardware accelerators, wherein one or more network processing tasks have to be performed on the frame, one or more bits indicating a time sensitive network shaper of the frame, one or more bits indicating an in-band telemetry status, one or more bits indicating a virtual local area network tag priority of the frame.
In an implementation manner of the second aspect, the frame priority field is a first field in the frame header.
This facilitates a fast response and thus processing efficiency of the hardware device processing the frame, e.g. of the first aspect.
In an implementation of the second aspect, the frame format of the frame is based on one of: controller area network (controller area network, CAN), CAN with flexible data rate (CAN FD), CAN XL, local interconnect network (LIN), FlexRay, media oriented systems transport (MOST), Ethernet, mobile industry processor interface (MIPI), or camera serial interface 2 (CSI-2).
Thus, the instruction frame of the second aspect is compatible with a plurality of established network protocols.
In an implementation form of the second aspect, the frame format of the frame is a standardized frame format comprising a plurality of fields, wherein each field is parameterized by a field index or offset parameter and a field size parameter.
This enables a network entity (e.g., an automotive gateway) to efficiently handle the various network protocols used together.
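Such (index or offset, size) parameterization can be illustrated with a generic field extractor over a normalized frame; the big-endian, MSB-first bit numbering is an assumption:

```python
def extract_field(frame: bytes, offset_bits: int, size_bits: int) -> int:
    """Read the field starting offset_bits from the most significant bit of
    the frame and size_bits wide, so one parameterized parser can serve
    several protocol layouts (CAN, LIN, Ethernet, ...)."""
    word = int.from_bytes(frame, "big")
    shift = len(frame) * 8 - offset_bits - size_bits
    return (word >> shift) & ((1 << size_bits) - 1)
```

Changing only the (offset, size) parameter pairs, rather than the parsing logic, is what allows one frame normalizer to cover multiple protocols.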
In an implementation of the second aspect, the instruction frame further comprises a frame header and a payload, wherein the frame header and the payload comprise instructions indicating how a data frame associated with the instruction frame is to be processed by a processing circuit of a hardware device.
A third aspect of the invention provides a method for frame arbitration in a network entity, the method being performed by a hardware device and comprising: receiving at least a first frame and a second frame simultaneously at a plurality of ingress ports of the hardware device, wherein each frame comprises a frame header comprising a frame priority field, wherein the frame priority field comprises a set of bits, and wherein the set of bits comprises two or more subsets of bits, each subset of bits indicating respective arbitration metadata; and determining, by a processing circuit of the hardware device, which of the first frame and the second frame is to be processed and output, sequentially or concurrently, at a plurality of egress ports of the hardware device in a next clock cycle, based on the set of bits in the frame priority field of the first frame and the second frame, respectively.
In an implementation of the third aspect, the processing circuit updates the set of bits in the frame priority field of each frame before the frame is processed and output at the output port.
In an implementation of the third aspect, the set of bits in the frame priority field of the frame is updated based on information contained within the frame and/or based on available information of the hardware device itself and a network environment.
In an implementation of the third aspect, the set of bits in the frame priority field of the frame is updated based on at least one of a status of a queue receiving the frame, a timeout and/or timestamp of the frame, a list of one or more network processing tasks to be performed on the frame, a status of one or more hardware accelerators, wherein one or more network processing tasks have to be performed on the frame before the frame is output, a status of a time sensitive network shaper of the frame to be output, an in-band telemetry status, a virtual local area network tag priority of the frame.
In an implementation manner of the third aspect, each frame includes an instruction frame and a data frame, wherein the frame header including the frame priority field belongs to an instruction frame.
In an implementation of the third aspect, the hardware device receives and outputs each instruction frame in a control plane and each data frame on a data plane.
In an implementation of the third aspect, the processing circuit separates each frame received at the ingress port into an instruction frame and a data frame, wherein the instruction frame is provided with the frame header including the frame priority field.
In an implementation of the third aspect, the processing circuit processes the data frame of each frame according to an instruction included in the frame header of the frame and a payload of the instruction frame before outputting the frame at the output port.
In an implementation of the third aspect, the processing circuit associates a weight with the subset of bits in the frame priority field of each frame, and further determines which of the first frame and the second frame to output at the output port in the next clock cycle based on the weights associated with the subset of bits in the frame priority fields of the first frame and the second frame, respectively, and by executing a sorting algorithm.
In an implementation of the third aspect, N or more parallel queues provide the N ingress frames in parallel from the N ingress ports, and/or M or more parallel queues receive the M egress frames in parallel and send them to the M egress ports in parallel.
The method of the third aspect provides the same advantages as described above for the hardware device of the first aspect.
The hardware-centric approach of the present invention may be applied to any type of network entity, e.g., switches, intelligent network interface cards (NICs), routers, or gateways. The hardware device enables frame arbitration within the network entity, thereby achieving network traffic arbitration in an autonomous and automated manner. The frame arbitration of the hardware device does not require software interaction (although software may be used to define the startup configuration). The fact that the hardware device does not rely on a CPU executing lines of source code improves performance and time determinism, as well as other quality of service (quality of service, QoS) metrics.
The hardware devices may be synthesized in silicon, may conform to SDN network architecture at the device level, and may be distributed over the control plane and data plane of the network entity. The processing circuitry of the hardware device may perform the network traffic arbitration in a distributed manner at different processing stages of a network entity, such as processing stages for frame priority supervision, queuing or dequeuing, processing and/or shaping. Therefore, there is virtually no overhead in terms of delay, jitter, and resource consumption due to the optimized hardware architecture of the hardware device.
The hardware device may also adapt its behavior on the fly, e.g., according to a predefined rule-based arbitration algorithm (e.g., round-robin) or in response to changing network environmental conditions (e.g., network congestion, internal queue overflow status, or expiration of a per-frame timeout). This may be done at runtime without interrupting the operation of the network entity.
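As one example of such a predefined rule-based policy, a round-robin arbiter over the request lines could be modelled like this (the interface is an assumption for illustration):

```python
class RoundRobinArbiter:
    """Rule-based fallback policy: grant the next requesting port after the
    one granted previously, wrapping around."""

    def __init__(self, n_ports: int):
        self.n = n_ports
        self.last = -1  # no port granted yet

    def grant(self, requests):
        """requests: one bool per port; returns the granted port or None."""
        for k in range(1, self.n + 1):
            idx = (self.last + k) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None
```

Because the policy is stateless apart from the last grant, it can be swapped in or out at runtime without disturbing frames already in flight.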
The hardware device may be controlled by the set of bits in the frame priority field of the frame. The arbitration metadata (subsets of bits) in the set of bits may be based on a large and diverse set of inputs: some are static and may be (re)configurable through a set of registers (e.g., memory-mapped) of the hardware device, while others are updated dynamically, e.g., on the fly, by the hardware device itself based on real-time conditions of the network and of the device itself (e.g., traffic status, memory status, arbitration priority status, etc.). The priority given to the bits of the set of bits may lead to a diversified and flexible set of operation modes for the hardware device.
It should be noted that the hardware device is also referred to as a "distributed arbitration engine (distributed arbitration engine, DAE)" in the present invention.
It should be noted that all the devices, elements, units and means described in the present application may be implemented as hardware elements. All steps performed by the various entities described in the present application and functions to be performed by the various entities described are intended to mean that the respective entities are adapted to perform the respective steps and functions. Even though in the following description of specific embodiments the specific functions or steps to be performed by external entities are not reflected in the description of specific detailed elements of the entity performing the specific steps or functions, it is obvious to a skilled person that these methods and functions may be implemented in respective hardware elements.
Drawings
The various aspects described above and the manner of attaining them will be elucidated with reference to the accompanying drawings, wherein:
Fig. 1 shows a hardware device for frame arbitration in a network entity according to the invention.
Fig. 2 shows an instruction frame according to the invention with a frame format suitable for frame arbitration.
Fig. 3 shows an exemplary architecture of a network entity, in particular an automotive gateway controller, in which a hardware device according to the invention is deployed.
Fig. 4 shows the decoupling of the control plane and the data plane according to the SDN method, which may be implemented by the hardware device of the present invention.
Fig. 5 shows decoupling of a control plane from a data plane across an exemplary network entity in which the hardware device of the present invention is deployed.
Fig. 6 shows an example of frame arbitration performed by the hardware device of the present invention based on the frame priority field.
Fig. 7 shows an exemplary architecture of a hardware device of the present invention, which may be used in an automotive gateway controller.
Fig. 8 shows the adjustment of the frame priority field by using a multiplexer.
Fig. 9 shows a method for frame arbitration in a network entity according to the invention.
Detailed Description
Fig. 1 shows a hardware device 100 according to the invention. The hardware device 100 is used to perform frame arbitration in a network entity. To this end, the hardware device 100 may be deployed in a network entity. For example, the hardware device may include the processing circuit 105, and the processing circuit 105 may be installed in a network entity. For example, the processing circuitry 105 may be deployed in a distributed manner in a network entity, e.g., included in one or more processing stages of the network entity. Each processing stage of the network entity may be operable to perform one or more processing tasks on one or more frames.
The processing circuitry 105 may be responsible for executing, performing, or initiating the various operations of the hardware device 100 described herein. The processing circuitry 105 may include analog circuitry, digital circuitry, or both analog and digital circuitry. The digital circuitry may include components such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (digital signal processor, DSP), or a multi-purpose processor. Hardware device 100 may also include memory circuitry for storing information. For example, hardware device 100 may include one or more queues, such as one or more first-in-first-out (first in first out, FIFO) memories for storing and queuing frames, for example, between processing stages.
In addition to the processing circuitry 105, the hardware device 100 also includes N ingress ports 101 for receiving N frames 102 and M egress ports 103 for outputting M frames 102. Both N and M are integers, with N > 1 and M > 1. In fig. 1, for example, N = 4 and M = 3. Typically N > M, but N < M or N = M is also possible. As shown in fig. 1, processing circuitry 105 is connected to ingress ports 101 and egress ports 103 of hardware device 100.
Each frame 102 processed by hardware device 100 may be designed as shown in the upper portion of fig. 2. Specifically, each frame 102 includes a frame header 201 that includes a frame priority field 202, where the frame priority field 202 includes a set of bits. This set of bits includes two or more subsets of bits, wherein each subset of bits is indicative of corresponding arbitration metadata. Each frame 102 may optionally further include a payload 205 and/or a tail 206.
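The set of bits and its subsets can be modelled as simple bit slices of the priority word; the subset widths below are illustrative only, not taken from the patent:

```python
def split_subsets(priority_word: int, widths):
    """Split a frame priority field into its subsets of bits,
    most significant subset first."""
    out, shift = [], sum(widths)
    for w in widths:
        shift -= w
        out.append((priority_word >> shift) & ((1 << w) - 1))
    return out
```

For instance, with hypothetical widths of 3, 4, and 1 bits, an 8-bit priority word decodes into three pieces of arbitration metadata.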
As further shown in fig. 2, frame 102 may be instruction frame 203, or it may comprise instruction frame 203. In the first case, where frame 102 is instruction frame 203, the optional lower part of fig. 2 does not apply. In the second case, where frame 102 comprises instruction frame 203, frame 102 may, as shown by the dashed box in the lower part of fig. 2, comprise instruction frame 203 and, in addition, data frame 204. That is, frame 102 may be separated into instruction frame 203 and data frame 204. In this case, the frame header 201 including the frame priority field 202 belongs to the instruction frame 203, as shown.
If frame 102 includes instruction frame 203 and data frame 204, then in addition to frame priority field 202, frame header 201 and/or optional payload 205 of instruction frame 203 may include instructions. These instructions indicate how the data frame 204, which in this case belongs to the same frame 102 as instruction frame 203, is to be processed, e.g., by processing circuitry 105 of hardware device 100 or by one or more processing stages of a network entity.
In any case, when at least a first frame 102a and a second frame 102b (both being frames 102, e.g. as shown in fig. 2) are received simultaneously at the ingress ports 101, the processing circuit 105 is configured to determine which of the first frame 102a and the second frame 102b is to be processed and output (sequentially or concurrently) at the egress ports 103 in the next clock cycle. That is, the processing and output order between the first frame 102a and the second frame 102b is determined. This determination is performed by the processing circuit 105 based on the sets of bits in the frame priority fields 202 of the first frame 102a and the second frame 102b, respectively. The processing circuit 105 may process the set of bits in the frame priority field 202 of the first frame 102a and the set of bits in the frame priority field 202 of the second frame 102b. Based on these, it may determine which of the two frames 102a, 102b has the higher priority, and may process and output this higher-priority frame first at the egress ports 103.
Each set of bits may be regarded as a control word that is used to determine the priority of the frame 102 to which it belongs. The processing and output order between the first frame 102a and the second frame 102b may automatically occur in the processing circuit 105 according to this control word. Different types of arbitration metadata may be encoded by a subset of bits in the frame priority field 202. The determination steps of processing circuitry 105 do not require software or a processor that operates based on software or machine instructions. By performing the determining step, at least frame arbitration between the first frame 102a and the second frame 102b is achieved. Of course, the determining step may be done based on any two frames 102 received at the ingress port 101 simultaneously, e.g. from two or more parallel queues connected to the ingress port 101, and may equally be performed for more than two frames 102 received in parallel. Thus, hardware device 100 is suitable for frame arbitration, typically network traffic arbitration.
Fig. 3 illustrates an example of a network entity 300 in which the hardware device 100 may be implemented. In this example, the network entity 300 is an automotive gateway comprising the hardware device 100. In particular, the hardware device 100 may be implemented in a gateway controller of a gateway. It should be noted that the network entity 300 may also be of different types, for example, the network entity 300 may be an intelligent Network Interface Card (NIC), a switch or a router.
As also shown in fig. 3, the network entity 300 may include a plurality of processing stages. For example, in an exemplary case, the network entity 300 may include four processing stages. These processing stages are referred to in fig. 3 as "frame normalizer", "filter and policing", "gateway (signal and PDU)" and "traffic shaping", respectively. Each processing stage of the network entity 300 may be configured to perform one or more different processing tasks for each frame 102. The frame 102 may follow a data path through the network entity 300 that directs the frame 102 through each of a plurality of processing stages.
Hardware device 100 is represented in fig. 3 as a DAE. Processing circuitry 105 of hardware device 100 may be located in each processing stage of network entity 300. The processing circuitry 105 of the hardware device 100 may be distributed over these processing stages.
In one example, as can be seen in fig. 3, each processing stage may include at least one DAE. Thus, one or more DAE of each respective processing stage may be considered to form a hardware device 100 as described above with respect to fig. 1. In this case, the input port of the corresponding processing stage may correspond to the input port 101 of the hardware device 100, and in this case, the output port of the corresponding processing stage may correspond to the output port 103 of the hardware device 100. Nonetheless, one or more DAE deployed in each of the plurality of processing stages of the network entity 300 may also be considered to form the distributed hardware device 100. In this case, the ingress port of the network entity 300 may correspond to the ingress port 101 of the hardware device 100, and the egress port of the network entity 300 may correspond to the egress port 103 of the hardware device 100. Such a distributed implementation of the processing circuit 105 across the processing stages of the network entity 300 may have only a negligible overhead in terms of hardware resources and practically no penalty for delay or jitter of the frames.
The interface in the network entity 300 between one processing stage and the next may be based on one or more queues of the network entity 300 (e.g., FIFO memories). The distributed arbitration performed by the processing circuitry 105 of one or more hardware devices may be applied across all frames 102 as they move through all processing stages of the network entity 300. Frame arbitration may be necessary when two frames 102 arrive at a processing stage at the same time and need to be processed with the same processing resources. Before a processing stage, frames 102 may be temporarily stored in queues, and the processing circuitry 105 may determine which of the frames 102 stored in those queues is processed first in the processing stage with the required processing resources.
In general, all frames 102 provided in parallel and/or sequentially to an ingress port of the network entity 300 may flow through all processing stages of the network entity 300, may be processed by each processing stage, and may then be output in parallel and/or sequentially at an egress port of the network entity 300. The processing circuitry 105 of one or more hardware devices 100 may decide which frames 102 need to be processed by which processing resources of a processing stage at what time, wherein all of these decisions may be made based on priority criteria encoded into the respective frame priority fields 202 of the frames 102. Optionally, these frame priority fields 202 may be updated in one or more or each processing stage of the network entity 300 after the respective frame 102 has been processed. In this way, the encoded priority criteria may be changed or reconfigured.
As shown in fig. 4, the processing circuitry 105 of the hardware device 100, inspired by the SDN method, may be configured to separate each frame 102 it receives at its ingress port 101 into an instruction frame 203 and a data frame 204, as shown in fig. 2. For example, in fig. 3, each frame 102 input to a first processing stage ("frame normalizer") of a network entity 300 may be decomposed into an instruction frame 203 and a data frame 204 in this manner by processing circuitry 105 disposed in this processing stage. Instruction frame 203 may contain arbitration metadata in frame priority field 202 and may contain additional metadata, such as metadata of original frame 102. The data frame 204 may substantially contain the payload of the original frame 102.
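The separation of a frame 102 into an instruction frame 203 and a data frame 204 can be sketched as follows for illustration; the dict representation and key names are assumptions of this sketch, not part of the description:

```python
# SDN-inspired split: each incoming frame is decomposed into an instruction
# frame (arbitration metadata plus further metadata of the original frame)
# and a data frame (essentially the original payload).

def split_frame(frame: dict) -> tuple:
    instruction_frame = {
        "priority_field": frame["priority_field"],   # arbitration metadata (field 202)
        "metadata": frame["header_metadata"],        # metadata of the original frame
    }
    data_frame = {"payload": frame["payload"]}       # payload of the original frame
    return instruction_frame, data_frame

instr, data = split_frame({
    "priority_field": 0x8000,
    "header_metadata": {"ingress_port": 1},
    "payload": b"\x01\x02\x03",
})
```

From this point on, the instruction frame travels in the control plane while the data frame travels in the data plane, as described next.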
The processing circuitry 105 of the hardware device 100 may also be configured to receive and output instruction frames 203 in the control plane and to receive and output data frames 204 in the data plane. In this case, the processing circuit 105 may not have to perform the separation itself. That is, the processing circuit 105 may receive, process, and output the frame 102 in already separated form (as shown in fig. 2). For example, in fig. 3, after a frame 102 has been separated in one processing stage, the processing circuitry 105 in one or more subsequent processing stages may process the instruction frame 203 (including the frame priority field 202) and the data frame 204 in the control plane and the data plane, respectively.
In other words, each frame 102 may first be separated into an instruction frame 203 and a data frame 204 in the network entity 300, and from this step on, the instruction frame 203 may be provided with a frame priority field 202 describing the priority assigned to this frame 102 (in particular to the data frame 204). The frame priority field 202 may be used to perform arbitration with other frames 102 in each processing stage of the network entity 300. For example, the processing circuitry 105 in each processing stage may determine which frame 102 of two or more frames stored in the queue before the processing stage is selected first for processing in that processing stage. This frame arbitration is performed according to the priority levels of the frames 102 queued in parallel, which are determined by the set of bits in their respective frame priority fields 202. These priority levels are determined by arbitration metadata encoded into the frame header 201.
Thus, the priority level of each frame 102 is encoded in the frame priority field 202. This field 202 may be the first field of the frame header 201, which is advantageous for efficiency (implementation and execution). In this case, the frame priority field 202 may be the first segment read by the processing circuit 105, so that frames are read without forcing them out of the queue, and an immediate response may be given (in only one clock cycle).
The priority level of any frame 200 may be a function of many variables of the arbitration metadata, including, for example, port/queue status (almost empty/almost full), network congestion status, time, intrinsic frame priority, TSN shaper rules, estimated processing time per task, the type and number of available hardware accelerators (e.g., indicated by the status of shared hardware accelerator resources), estimated telemetry data per stream, and the like. All of these variables may be processed directly in hardware by the processing circuitry 105, without the need for software.
By the method of the present invention, as shown in fig. 5, a fully SDN-layered network entity 300 may be obtained, which may be driven by the arbitration metadata in the frame header 201 of the frame 102, in particular the arbitration metadata in the frame priority field 202. As can be seen in fig. 5, the format of the frame 102 may be based on a variety of different network protocols. For example, the protocol may be CAN, CAN flexible data rate (CAN FD), CAN XL, Local Interconnect Network, FlexRay, Media Oriented Systems Transport, Ethernet, Mobile Industry Processor Interface, or Camera Serial Interface 2. For example, the frame arbitration of the present invention performed in the network entity 300 may be inspired by the carrier-sense multiple access with collision detection (CSMA/CD) medium access control of the CAN network.
The arbitration metadata of the frame 102 may be transferred from one processing stage to the next in the network entity 300. Each processing stage may process the data frame 204 according to instructions in the instruction frame 203 and may process the instruction frame 203 to move and/or update the frame priority field 202 and optionally other metadata. This may be accomplished by, for example, the processing circuitry 105 of one or more hardware devices 100 deployed in a distributed fashion in a processing stage. Processing and optionally modifying or updating the arbitration metadata in the frame priority field 202 of each processed frame 102 may be done at runtime.
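The hand-over of the arbitration metadata from one processing stage to the next, together with its runtime update, can be sketched as follows; the per-stage `(process, update_priority)` pair and all names are illustrative assumptions, not an API defined by the description:

```python
def traverse_stages(instruction_frame: dict, data_frame, stages):
    """Carry a frame's arbitration metadata from stage to stage: each stage
    processes the data frame as directed by the instruction frame and may
    update the priority word in the frame priority field at runtime."""
    for process, update_priority in stages:
        data_frame = process(data_frame, instruction_frame)
        instruction_frame = {**instruction_frame,
                             "prio": update_priority(instruction_frame)}
    return instruction_frame, data_frame

# Three identical stages, each transforming the payload and bumping the priority:
stages = [(lambda d, i: d + 1, lambda i: i["prio"] + 1)] * 3
instr, data = traverse_stages({"prio": 0}, 0, stages)
```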
To provide the best possible arbitration policy within the network entity 300, each frame 102 may be provided with different arbitration metadata, which may be organized in one or more information fields of the frame priority field 202. In terms of arbitration policy, the following factors may participate in inline priority allocation through the frame priority field 202 of the frame 102. In the following example, the frame priority field 202 may be a 16-bit control word:
■ Highest priority [1 bit: interrupt]
■ Queue status [1 bit: (almost) full / (almost) empty]
■ Timeout or timestamp [4 bits: time factor]
■ State of the processing task or hardware accelerator [2 bits: unused, unused→in use, in use→unused, in use]
■ Shaper status [2 bits: unused, unused→in use, in use→unused, in use]
■ VLAN tag priority [3 bits: class]
■ In-band telemetry status [3 bits: counter]
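Under the assumption that the seven factors above are packed MSB-first in the listed order (1 + 1 + 4 + 2 + 2 + 3 + 3 = 16 bits), the control word can be assembled as follows; the function and parameter names are illustrative, not taken from the description:

```python
def pack_priority(interrupt: int, queue_state: int, time_factor: int,
                  task_state: int, shaper_state: int,
                  vlan_priority: int, telemetry: int) -> int:
    """Assemble the 16-bit frame-priority control word, MSB-first in the
    order listed above (highest-weight factor in the most significant bit)."""
    assert interrupt < 2 and queue_state < 2 and time_factor < 16
    assert task_state < 4 and shaper_state < 4
    assert vlan_priority < 8 and telemetry < 8
    word = interrupt
    word = (word << 1) | queue_state     # queue status
    word = (word << 4) | time_factor     # timeout / timestamp factor
    word = (word << 2) | task_state      # processing task / accelerator state
    word = (word << 2) | shaper_state    # shaper status
    word = (word << 3) | vlan_priority   # VLAN tag priority class
    word = (word << 3) | telemetry       # in-band telemetry counter
    return word
```

With this packing, an interrupt-only word is 0x8000, i.e., the interrupt bit dominates every combination of the remaining factors under an unsigned compare.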
As shown in fig. 6, selecting a frame 102 from two or more queued frames 102 in two or more parallel queues (FIFOs in fig. 6), which may be disposed between successive processing stages of the network entity 300, may be done in only one clock cycle based on the sets and subsets of bits (arbitration metadata) allocated within the frame priority field 202 of the frame 102 (here the instruction frame 203). Frame selection may include an ultrafast sorting process. The selection, and the ordering of the selection results, may be performed in only one clock cycle due to the encoding of the arbitration metadata embedded in each instruction frame 203. Such an arbitration policy may be more efficient than software-based implementations.
As can be seen in fig. 6, the selection of the next frame 102 to be processed is done by the hardware device 100 based on the instruction frames 203, and each instruction frame 203 can be used to determine how to process the corresponding data frame 204. Both the instruction frames 203 and the data frames 204 may be stored in parallel FIFOs. The processing circuitry 105 of the hardware device 100 may control at least one multiplexer to process the data frames 204 in an order determined based on the instruction frames 203.
Fig. 7 shows an exemplary architecture of the processing circuitry 105 of the hardware device 100, which builds on fig. 6 and shows that the processing circuitry 105 may arbitrate the frames 102 according to the frame priority field 202 in the instruction frame 203 of each frame 102, such that the corresponding data frame 204 of the frame 102 is processed by a plurality of processing tasks according to the frame arbitration. Each processing task may be implemented by a hardware accelerator, and multiple multiplexers may be used to route the data frames 204 to the correct processing task defined by the instruction frame 203 and according to the frame arbitration order determined based on the instruction frames 203. FIFOs may be used to line up data frames 204 requiring the same processing task (in particular the same hardware accelerator) before and after the multiplexers. FIFOs may also be used to store the corresponding instruction frames 203.
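The multiplexer-and-FIFO routing of data frames to processing tasks can be mimicked in software as follows; the task names and the generator-based interface are assumptions of this sketch, chosen only to illustrate the data flow of fig. 7:

```python
from collections import deque

def dispatch(instr_fifo: deque, data_fifo: deque, tasks: dict):
    """Pop instruction/data frame pairs from parallel FIFOs and route each
    data frame to the processing task named in its instruction frame (the
    role played by the multiplexers in fig. 7)."""
    while instr_fifo:
        instr = instr_fifo.popleft()
        data = data_fifo.popleft()
        yield tasks[instr["task"]](data)

instr_fifo = deque([{"task": "checksum"}, {"task": "forward"}])
data_fifo = deque([3, 4])
tasks = {"checksum": lambda d: d * 2, "forward": lambda d: d + 1}
results = list(dispatch(instr_fifo, data_fifo, tasks))  # [6, 5]
```

In hardware, the `tasks` lookup corresponds to the multiplexer select signals derived from the instruction frame, and the deques correspond to the FIFOs before each accelerator.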
The frame priority field 202 of any frame 102 may be adapted by runtime reconfiguration (e.g., on the fly). This may be accomplished by one or more hardware multiplexers 800, as shown in fig. 8. For example, the hardware device 100 may include a set of configurable multiplexers 800. The hardware device 100 may also be used to associate weights with the subsets of bits in the frame priority field 202 of any frame 102. In this case, different configurations of the set of configurable multiplexers 800 may correspond to different weights associated with the subsets of bits in the frame priority field 202 of each frame 102. As can be seen in fig. 8, the multiplexing may reorder the weights associated with the subsets of bits in the frame priority field 202. The frame priority field 202 entering the multiplexer 800 may thus differ from the frame priority field 202 leaving the multiplexer 800; accordingly, the frame priority field 202 is adjusted or reconfigured. Self-reconfiguration (by the hardware device 100) and (re)configuration, e.g., by a host CPU, may be performed on the fly without interrupting the operation of the network entity 300. This capability enables different modes of operation. The relative priority given to each frame 102 may accordingly be configurable and may be reconfigured at runtime. For example, the bits of the frame priority field 202 of each frame 102 may be organized or reorganized such that higher-weight bits come first, e.g., ordered from MSB to LSB.
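The reordering of the bit subsets by the configurable multiplexers 800 can be modeled as a permutation of the subsets within the 16-bit word; the field layout reuses the example 16-bit list given earlier, and all names are illustrative assumptions:

```python
# Example field layout of the 16-bit control word (MSB-first), mirroring the
# list of arbitration factors given earlier; names are illustrative.
FIELDS = [("interrupt", 1), ("queue", 1), ("time", 4), ("task", 2),
          ("shaper", 2), ("vlan", 3), ("telemetry", 3)]

def repack(word: int, order: list) -> int:
    # Extract each subset of bits from its current MSB-first position ...
    values, shift = {}, 16
    for name, width in FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    # ... then rebuild the word with the subsets permuted into `order`,
    # changing the relative weight of each arbitration factor.
    widths = dict(FIELDS)
    out = 0
    for name in order:
        out = (out << widths[name]) | values[name]
    return out

identity = [name for name, _ in FIELDS]
promoted = ["vlan", "interrupt", "queue", "time", "task", "shaper", "telemetry"]
```

Moving a subset toward the MSB end raises its weight under the unsigned compare used for arbitration, which is exactly the effect of reconfiguring the multiplexers 800.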
Frame arbitration performed based on processing of the frame priority field 202 by the processing circuit 105 may be based on sorting of the sets of bits (e.g., according to a resulting value based on the weight assigned to each bit). Simple algorithms (to reduce complexity) can be used, enabling an efficient and cost-effective arbitration scheme implemented in hardware. Each frame 102 may be appropriately identified and made ready for arbitration by the hardware device 100, with self-contained arbitration metadata stored in the instruction frame 203. For example:
Instruction set = PrioFrame (16 bits) + FrameID (PortNo + Timestamp) + Tasks2Exec + ExecNow + SeqNumber + ...
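Based on such self-contained metadata, selecting the next frame from several parallel queues can be sketched as a single maximum over the queued instruction frames; the field names mirror the instruction-set example above, but the key names and the tie-breaking rule are assumptions:

```python
def select_next(queued_instruction_frames: list) -> dict:
    """Pick the next frame to process: the highest 16-bit priority word
    (PrioFrame) wins, and the lower sequence number breaks ties."""
    return max(queued_instruction_frames,
               key=lambda f: (f["prio_frame"], -f["seq_number"]))

queued = [
    {"frame_id": "A", "prio_frame": 0x0300, "seq_number": 2},
    {"frame_id": "B", "prio_frame": 0x8000, "seq_number": 5},
    {"frame_id": "C", "prio_frame": 0x8000, "seq_number": 1},
]
chosen = select_next(queued)  # frame "C": same priority as "B", earlier sequence
```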
Frame arbitration may also take into account the state of one or more hardware accelerators and one or more internal queues of network entity 300, e.g., to improve QoS (e.g., avoid unnecessary message dropping in queues) and/or to optimize the use of shared hardware resources (e.g., management, scheduling, and reuse (looping) of processing tasks in switching, routing, or gateway functions of network entity 300).
Furthermore, the frame arbitration of the present invention can be applied to any type of network. That is, it is protocol-independent (due to SDN and the decoupling provided by the instruction frame 203). Protocol-independent arbitration policies may be a driving factor in improving scalability.
In conclusion, putting all of the above policies and implementations together, autonomous and stateful frame arbitration can be implemented dynamically by the network entity 300 based on context.
Fig. 9 illustrates a method 900 that may be used for frame arbitration in a network entity 300 in accordance with the present invention. Method 900 is performed by hardware device 100. The method 900 comprises a step 901 of simultaneously receiving at least a first frame 102a and a second frame 102b at a plurality of ingress ports 101 of the hardware device 100. Each frame 102, 200 comprises a frame header 201 comprising a frame priority field 202, wherein the frame priority field 202 comprises a set of bits, wherein the set of bits comprises two or more subsets of bits, and wherein each subset of bits is indicative of respective arbitration metadata.
The method 900 further comprises a step 902 of determining, by the processing circuitry 105 of the hardware device 100, which of the first frame 102a and the second frame 102b to process and output sequentially or concurrently at the plurality of output ports 103 of the hardware device 100 in a next clock cycle based on the set of bits in the frame priority fields 202 of the first frame 102a and the second frame 102b, respectively.
The invention has been described in connection with various embodiments as an example and implementations. Other variations to the claimed subject matter can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the invention, and the independent claims. In the claims and in the description, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (19)

1. A hardware device (100) for frame arbitration in a network entity (300), the hardware device (100) comprising:
-N ingress ports (101) for receiving N frames (102), wherein N >1;
m output ports (103) for outputting M frames (102), wherein M >1;
-a processing circuit (105) connected to said inlet port (101) and to said outlet port (103);
wherein each frame (102) comprises a frame header (201) comprising a frame priority field (202), wherein the frame priority field (202) comprises a set of bits, and wherein the set of bits comprises two or more subsets of bits, each subset of bits indicating respective arbitration metadata, and
Wherein the processing circuit (105) is configured to, when at least a first frame (102 a) and a second frame (102 b) are received simultaneously at the ingress port (101):
-determining which of the first frame (102 a) and the second frame (102 b) to process and output sequentially or concurrently at the egress port (103) in a next clock cycle based on the set of bits in the frame priority field (202) of the first frame (102 a) and the second frame (102 b), respectively.
2. The hardware device (100) of claim 1, wherein the processing circuit (105) is configured to update the set of bits in the frame priority field (202) of each frame (102) before the frame (102) is processed and output at the output port (103).
3. The hardware device (100) according to claim 2, wherein the set of bits in the frame priority field (202) of the frame (102) is updated based on information contained within the frame (102) and/or based on available information of the hardware device (100) itself and a network environment.
4. A hardware device (100) according to claim 2 or 3, characterized in that the set of bits in the frame priority field (202) of the frame (102) is updated based on at least one of:
receiving a state of a queue of the frames (102);
-a timeout and/or a timestamp of the frame (102);
A list of one or more network processing tasks to be performed on the frame (102);
A state of one or more hardware accelerators, wherein one or more network processing tasks must be performed on the frame (102) before the frame (102) is output;
-a state of a time-sensitive network shaper of the frame (102) to be output;
In-band telemetry status;
virtual local area network tag priority of the frame (102).
5. The hardware device (100) of any of claims 1 to 4, wherein each frame (102) comprises an instruction frame (203) and a data frame (204), wherein the frame header (201) comprising the frame priority field (202) belongs to the instruction frame (203).
6. The hardware device (100) of claim 5, configured to:
Receiving and outputting each instruction frame (203) in a control plane;
receiving and outputting each data frame (204) in a data plane.
7. The hardware device (100) of any of claims 1 to 4, wherein the processing circuit (105) is configured to separate each frame (102) received at the ingress port (101) into an instruction frame (203) and a data frame (204), wherein the instruction frame (203) is provided with the frame header (201) comprising the frame priority field (202).
8. The hardware device (100) of any one of claims 5 to 7, wherein the processing circuit (105) is further configured to process the data frames (204) of each frame (102) in accordance with instructions included in the frame header (201) of the frame (200) and a payload (205) of the instruction frame (203) before outputting the frame (102) at the egress port (103).
9. The hardware device (100) of any one of claims 1 to 8, wherein the processing circuit (105) is configured to:
Associating weights with the subset of bits in the frame priority field (202) of each frame (102);
Further based on the weights associated with the subset of bits in the frame priority field (202) of the first frame (102 a) and the second frame (102 b), respectively, determining, by performing a sorting algorithm, which of the first frame (102 a) and the second frame (102 b) is output at the egress port (103) in the next clock cycle.
10. The hardware device (100) of claim 9, wherein the processing circuit (105) comprises:
A set of configurable multiplexers (800), wherein different configurations of the set of configurable multiplexers (800) correspond to different weights associated with the subset of bits in the frame priority field (202) of each frame (102).
11. The hardware device (100) according to any of claims 1 to 10, characterized in that,
The N ingress ports (101) are connected to N or more parallel queues for providing N ingress frames (102) in parallel to the N ingress ports (101), and/or
the M egress ports (103) are connected to M or more parallel queues for receiving M egress frames (102) in parallel and transmitting them in parallel to the M egress ports (103).
12. An instruction frame (203) characterized by having a frame format suitable for frame arbitration in a network entity, the frame format comprising:
a frame header (201);
Wherein the frame header (201) comprises a frame priority field (202);
Wherein the frame priority field (202) comprises a set of bits;
wherein the bit set comprises two or more bit subsets, and
Wherein each bit subset indicates corresponding arbitration metadata.
13. The instruction frame (203) of claim 12, wherein the respective arbitration metadata comprises one or more bits indicating a priority of the instruction frame (203).
14. The instruction frame (203) according to claim 12 or 13, wherein the respective arbitration metadata further comprises at least one of:
one or more bits indicating a queue status of a queue providing the instruction frame (203);
One or more bits indicating a timeout or timestamp of the instruction frame (203);
One or more bits indicating a list of one or more processing tasks to be performed on the instruction frame (203);
One or more bits indicating the status of one or more hardware accelerators, wherein one or more network processing tasks must be performed on the instruction frame (203);
One or more bits indicating a time-sensitive network shaper of the instruction frame (203);
one or more bits indicating in-band telemetry status;
One or more bits indicating virtual local area network tag priority of the instruction frame (203).
15. The instruction frame (203) according to any one of claims 12 to 14, wherein the frame priority field (202) is a first field in the frame header (201).
16. The instruction frame (203) according to any one of claims 12 to 15, wherein the frame format is based on one of the following protocols:
-a controller area network (controller area network, CAN);
-CAN flexible data rate;
-CAN XL;
-a local interconnect network;
-FlexRay;
-media oriented system transmission;
-an ethernet network;
-a mobile industry processor interface;
camera serial interface 2.
17. The instruction frame (203) according to any one of claims 12 to 16, wherein the frame format is a standardized frame format comprising a plurality of fields, wherein each field is parameterized by a field index or offset parameter and a field size parameter.
18. The instruction frame (203) according to any one of claims 12 to 17, further comprising a payload (205), wherein the frame header (201) and the payload (205) comprise instructions indicating how a data frame (204) associated with the instruction frame (203) is to be processed by a processing circuit (105) of a hardware device (100).
19. A method (900) for frame arbitration in a network entity (300), characterized in that the method (900) is performed by a hardware device (100) and comprises:
Simultaneously receiving (901) at least a first frame (102 a) and a second frame (102 b) at a plurality of ingress ports (101) of the hardware device (100);
Wherein each frame (102, 102a, 102 b) comprises a frame header (201) comprising a frame priority field (202), wherein the frame priority field (202) comprises a set of bits, and wherein the set of bits comprises two or more subsets of bits, each subset of bits indicating respective arbitration metadata, and
-Determining (902), by a processing circuit (105) of the hardware device (100), which of the first frame (102 a) and the second frame (102 b) to process and output sequentially or concurrently at a plurality of output ports (103) of the hardware device (100) in a next clock cycle based on the set of bits in the frame priority field (202) of the first frame (102 a) and the second frame (102 b), respectively.
CN202280095939.3A 2022-05-11 2022-05-11 Adaptive Traffic Arbitration Engine Pending CN119174155A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/062782 WO2023217364A1 (en) 2022-05-11 2022-05-11 A self-adaptive traffic arbitration engine

Publications (1)

Publication Number Publication Date
CN119174155A true CN119174155A (en) 2024-12-20

Family

ID=81984690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280095939.3A Pending CN119174155A (en) 2022-05-11 2022-05-11 Adaptive Traffic Arbitration Engine

Country Status (3)

Country Link
EP (1) EP4473720A1 (en)
CN (1) CN119174155A (en)
WO (1) WO2023217364A1 (en)


Also Published As

Publication number Publication date
WO2023217364A1 (en) 2023-11-16
EP4473720A1 (en) 2024-12-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination