US20070014246A1 - Method and system for transparent TCP offload with per flow estimation of a far end transmit window - Google Patents
- Publication number
- US20070014246A1 (application US11/489,393)
- Authority
- US
- United States
- Prior art keywords
- tcp
- state
- network flow
- received
- determined network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/10—Program control for peripheral devices
- G06F13/12—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor
- G06F13/124—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware is a sequential transfer control unit, e.g. microprocessor, peripheral processor or state-machine
- G06F13/128—Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware is a sequential transfer control unit, e.g. microprocessor, peripheral processor or state-machine for dedicated transfers to a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/19—Flow control; Congestion control at layers above the network layer
- H04L47/193—Flow control; Congestion control at layers above the network layer at the transport layer, e.g. TCP related
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/27—Evaluation or update of window size, e.g. using information derived from acknowledged [ACK] packets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/36—Flow control; Congestion control by determining packet size, e.g. maximum transfer unit [MTU]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/41—Flow control; Congestion control by acting on aggregated flows or links
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/9063—Intermediate storage in different physical parts of a node or terminal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
- H04L49/9084—Reactions to storage capacity overflow
- H04L49/9089—Reactions to storage capacity overflow replacing packets in a storage arrangement, e.g. pushout
- H04L49/9094—Arrangements for simultaneous transmit and receive, e.g. simultaneous reading/writing from/to the storage element
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/12—Protocol engines
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/161—Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/163—In-band adaptation of TCP data exchange; In-band control procedures
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/166—IP fragmentation; TCP segmentation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/50—Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate
Definitions
- Certain embodiments of the invention relate to processing of TCP data and related TCP information. More specifically, certain embodiments of the invention relate to a method and system for transparent TCP offload with per flow estimation of a far end transmit window.
- In a TCP Offload Engine (TOE), the offloading engine performs all or most of the TCP processing, presenting to the upper layer a stream of data.
- the TTOE is tightly coupled with the operating system and therefore requires solutions that are dependent on the operating system and may require changes in the operating system to support it.
- the TTOE may require a side by side stack solution, requiring some kind of manual configuration by the application, for example, by explicitly specifying a socket address family for accelerated connections.
- the TTOE may also require some kind of manual configuration by an IT administrator, for example, by explicitly specifying an IP subnet address for accelerated connections to select which of the TCP flows will be offloaded. The offload engine is also very complex, as it needs to implement TCP packet processing.
- LSO Large segment offload
- TSO transmit segment offload
- MTU maximum transmission unit
- in LSO, the host sends the NIC transmit units that are bigger than the maximum transmission unit (MTU), and the NIC cuts them into segments according to the MTU. Since part of the host processing is linear in the number of transmitted units, this reduces the required host processing power. While efficient in reducing the transmit packet processing, LSO does not help with receive packet processing.
- the host would receive from the far end multiple ACKs, one for each MTU-sized segment. The multiple ACKs require consumption of scarce and expensive bandwidth, thereby reducing throughput and efficiency.
- the TCP flows may be split to multiple hardware queues, according to a hash function that guarantees that a specific TCP flow would always be directed into the same hardware queue.
- the mechanism takes advantage of interrupt coalescing to scan the queue and aggregate subsequent packets on the queue belonging to the same TCP flow into a single large receive unit.
- While this mechanism does not require any additional hardware from the NIC besides multiple hardware queues, it may have various performance limitations. For example, if the number of flows were larger than the number of hardware queues, multiple flows would fall into the same queue, resulting in no LRO aggregation for that queue. If the number of flows is larger than twice the number of hardware queues, no LRO aggregation is performed on any of the flows. The aggregation may be limited to the number of packets available to the host in one interrupt period. If the interrupt period is short, and the number of flows is not small, the number of packets that are available to the host CPU for aggregation on each flow may be small, resulting in limited or no LRO aggregation, even if the number of hardware queues is large.
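As a rough illustration of the queue-splitting scheme described above, the following sketch hashes a flow's 4-tuple to a queue index so that every packet of a flow lands in the same queue. The choice of CRC32 and the names `NUM_QUEUES`, `queue_for_flow` and `flows_per_queue` are assumptions made for illustration; the text does not specify a hash function.

```python
import zlib

NUM_QUEUES = 4  # hypothetical number of hardware queues


def queue_for_flow(src_ip, dst_ip, src_port, dst_port, num_queues=NUM_QUEUES):
    """Map a TCP 4-tuple to a queue index; a given flow always maps to the same queue."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def flows_per_queue(flows, num_queues=NUM_QUEUES):
    """Count how many distinct flows share each queue."""
    counts = [0] * num_queues
    for flow in set(flows):
        counts[queue_for_flow(*flow, num_queues=num_queues)] += 1
    return counts
```

With more flows than queues, at least one queue necessarily holds several flows, which is the situation the text identifies as limiting aggregation.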
- the LRO aggregation may be performed on the host CPU, resulting in additional processing.
- the driver may deliver to the TCP stack a linked list of buffers comprising a header buffer followed by a series of data buffers, which may require more processing than in the case where all the data is contiguously delivered on one buffer.
- the computational power of the offload engine needs to be very high or at least the system needs a very large buffer to compensate for any additional delays due to the delayed processing of the out-of-order segments.
- additional system memory bandwidth may be consumed when the previously out-of-order segments are copied to respective buffers.
- the additional copying provides a challenge for present memory subsystems, and as a result, these memory subsystems are unable to support high rates such as 10 Gbps.
- TCP segments may arrive out-of-order with respect to the order in which they were transmitted. This may prevent or otherwise hinder the immediate processing of the TCP control data and prevent the placing of the data in a host buffer. Accordingly, an implementer may be faced with the option of dropping out-of-order TCP segments or storing the TCP segments locally on the NIC until all the missing segments have been received. Once all the TCP segments have been received, they may be reordered and processed accordingly. In instances where the TCP segments are dropped or otherwise discarded, the sending side may have to re-transmit all the dropped TCP segments, which in some instances may result in about a fifty percent (50%) decrease in throughput or bandwidth utilization.
- a method and/or system for transparent TCP offload with per flow estimation of a far end transmit window substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
- FIG. 1A is a block diagram of an exemplary system for transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 1B is a block diagram of another exemplary system for transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 1D is a block diagram of a system for handling transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 2A is a diagram illustrating exemplary steps that may be utilized for handling out-of-order data when a packet P3 and a packet P4 arrive out-of-order with respect to the order of transmission, in accordance with an embodiment of the invention.
- FIG. 2B is a flow chart illustrating exemplary steps for transparent TCP offload with transmit-receive coupling, in accordance with an embodiment of the invention.
- FIG. 3 is a flow chart illustrating exemplary steps for transparent TCP offload with per flow estimation of far end transmit window, in accordance with an embodiment of the invention.
- Certain embodiments of the invention may be found in a method and system for transparent TCP offload. Aspects of the method and system may comprise storing at a network interface card (NIC) processor state information for a received TCP segment and state information for transmitted TCP segments for a determined network flow without transferring state information for the received TCP segment to a host system communicatively coupled to the NIC.
- the generation of a new TCP segment comprising the collected received TCP segments may be controlled based on the occurrence of a termination event and a transmit window size.
- the period of time for aggregation of received TCP segments may be calculated based on the sequence numbers of the next expected TCP segment and the next received acknowledgement packet.
- the generated new TCP segment, new state information for the generated new TCP segment, and the state information for the transmitted TCP segments may be communicated to the host system for TCP offload.
- each of the plurality of TCP segments received would have to be individually processed by a host processor in the host system.
- TCP processing requires extensive CPU processing power in terms of both protocol processing and data placement on the receiver side.
- Current processing systems and methods involve the transfer of TCP state to dedicated hardware such as a NIC, where significant changes to the host TCP stack and/or underlying hardware are required.
- FIG. 1A is a block diagram of an exemplary system for transparent TCP offload, in accordance with an embodiment of the invention. Accordingly, the system of FIG. 1A may be adapted to handle transparent TCP offload of transmission control protocol (TCP) datagrams or packets.
- TCP transmission control protocol
- the system may comprise, for example, a CPU 102, a memory controller 104, a host memory 106, a host interface 108, a network subsystem 110 and an Ethernet 112.
- the network subsystem 110 may comprise, for example, a transparent TCP-enabled Ethernet Controller (TTEEC) or a transparent TCP offload engine (TTOE) 114.
- TTEEC transparent TCP-enabled Ethernet Controller
- TTOE transparent TCP offload engine
- the network subsystem 110 may comprise, for example, a network interface card (NIC).
- NIC network interface card
- the host interface 108 may be, for example, a peripheral component interconnect (PCI), PCI-X, PCI-Express, ISA, SCSI or other type of bus.
- the memory controller 104 may be coupled to the CPU 102, to the host memory 106 and to the host interface 108.
- the host interface 108 may be coupled to the network subsystem 110 via the TTEEC/TTOE 114.
- FIG. 1B is a block diagram of another exemplary system for transparent TCP offload, in accordance with an embodiment of the invention.
- the system may comprise, for example, a CPU 102, a host memory 106, a dedicated memory 116 and a chip set 118.
- the chip set 118 may comprise, for example, the network subsystem 110 and the memory controller 104.
- the chip set 118 may be coupled to the CPU 102, to the host memory 106, to the dedicated memory 116 and to the Ethernet 112.
- the network subsystem 110 of the chip set 118 may be coupled to the Ethernet 112.
- the network subsystem 110 may comprise, for example, the TTEEC/TTOE 114 that may be coupled to the Ethernet 112.
- the network subsystem 110 may communicate to the Ethernet 112 via a wired and/or a wireless connection, for example.
- the wireless connection may be a wireless local area network (WLAN) connection as supported by the IEEE 802.11 standards, for example.
- the network subsystem 110 may also comprise, for example, an on-chip memory 113.
- the dedicated memory 116 may provide buffers for context and/or data.
- the network subsystem 110 may comprise a processor such as a coalescer 111.
- the coalescer 111 may comprise suitable logic, circuitry and/or code that may be enabled to handle the accumulation or coalescing of TCP data.
- the coalescer 111 may utilize a flow lookup table (FLT) to maintain information regarding current network flows for which TCP segments are being collected for aggregation.
- the FLT may be stored in, for example, the network subsystem 110 .
- the FLT may comprise at least one of the following: a source IP address, a destination IP address, a source TCP address, a destination TCP address, for example.
- At least two different tables may be utilized, for example, a table comprising a 4-tuple lookup to classify incoming packets according to their flow.
- the 4-tuple lookup table may comprise at least one of the following: a source IP address, a destination IP address, a source TCP address, a destination TCP address, for example.
- a flow context table may comprise state variables utilized for aggregation such as TCP sequence numbers.
- the FLT may also comprise at least one of a host buffer or memory address including a scatter-gather-list (SGL) for non-contiguous memory, cumulative acknowledgments (ACKs), a copy of a TCP header and options, a copy of an IP header and options, a copy of an Ethernet header, and/or accumulated TCP flags, for example.
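The two-table arrangement described above (a 4-tuple lookup plus a per-flow context) may be sketched as a dictionary keyed by the 4-tuple. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, only a few of the listed context fields are modeled, and what the text calls source and destination TCP addresses are treated here as TCP port numbers.

```python
class FlowContext:
    """Per-flow state used for aggregation (field names are hypothetical)."""

    def __init__(self):
        self.next_expected_sn = 0   # TCP sequence number expected next
        self.cumulative_ack = 0     # latest cumulative ACK seen
        self.tcp_flags = 0          # OR of the TCP flags accumulated so far
        self.host_buffers = []      # stand-in for a scatter-gather list


class FlowLookupTable:
    """4-tuple lookup that classifies packets into flows."""

    def __init__(self):
        self._flows = {}

    def lookup_or_create(self, src_ip, dst_ip, src_port, dst_port):
        """Return (context, created) for the flow identified by the 4-tuple."""
        key = (src_ip, dst_ip, src_port, dst_port)
        created = key not in self._flows
        if created:
            self._flows[key] = FlowContext()
        return self._flows[key], created
```

A packet whose 4-tuple is already present updates the existing context; any other packet creates a new entry.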
- the coalescer 111 may be enabled to generate a single aggregated TCP segment from the accumulated or collected TCP segments when a termination event occurs. The single aggregated TCP segment may be communicated to the host memory 106, for example.
- the coalescer 111 may be a separate integrated chip from the chip set 118 embedded on a motherboard or may be embedded in a NIC.
- the dedicated memory 116 may be integrated with the chip set 118 or may be integrated with the network subsystem 110 of FIG. 1B.
- FIG. 1C is an alternative embodiment of an exemplary system for transparent TCP offload, in accordance with an embodiment of the invention.
- the system may comprise, for example, a host processor 124, a host memory/buffer 126, a software algorithm block 134 and a NIC block 128.
- the NIC block 128 may comprise a NIC processor 130, a processor such as a coalescer 131 and a reduced NIC memory/buffer block 132.
- the NIC block 128 may communicate with an external network via a wired and/or a wireless connection, for example.
- the wireless connection may be a wireless local area network (WLAN) connection as supported by the IEEE 802.11 standards, for example.
- WLAN wireless local area network
- the coalescer 131 may be a dedicated processor or hardware state machine that may reside in the packet-receiving path.
- the host TCP stack may comprise software that enables management of the TCP protocol processing and may be part of an operating system, such as Microsoft Windows or Linux.
- the coalescer 131 may comprise suitable logic, circuitry and/or code that may enable accumulation or coalescing of TCP data.
- the coalescer 131 may utilize a flow lookup table (FLT) to maintain information regarding current network flows for which TCP segments are being collected for aggregation.
- the FLT may be stored in, for example, the reduced NIC memory/buffer block 132.
- the coalescer 131 may enable generation of a single aggregated TCP segment from the accumulated or collected TCP segments when a termination event occurs.
- the single aggregated TCP segment may be communicated to the host memory/buffer 126, for example.
- providing a single aggregated TCP segment to the host for TCP processing significantly reduces overhead processing by the host 124.
- dedicated hardware such as a NIC 128 may assist with the processing of received TCP segments by coalescing or aggregating multiple received TCP segments so as to reduce per-packet processing overhead.
- in TCP processing systems, it is necessary to know certain information about a TCP connection prior to arrival of a first segment for that TCP connection.
- an offload mechanism may be provided that is stateless from the host stack perspective, while stateful from the offload device perspective, achieving comparable performance gain when compared to TTOE.
- Transparent TCP offload reduces the host processing power required for TCP by allowing the host system to process both receive and transmit data units that are bigger than an MTU.
- protocol data units (PDUs) of 64 KB may be processed rather than 1.5 KB PDUs in order to produce a significant reduction in the packet rate, thus reducing the host processing power for packet processing.
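The scale of the packet-rate reduction can be checked with a short calculation. The sketch below assumes a fully utilized 10 Gbps link, treats 1.5 KB as 1500 bytes and 64 KB as 65536 bytes, and ignores header overhead; these figures and the name `units_per_second` are illustrative assumptions rather than values taken from the text.

```python
LINK_RATE_BPS = 10e9  # assumed link rate: 10 Gbps


def units_per_second(unit_bytes, link_rate_bps=LINK_RATE_BPS):
    """Data units the host must process per second at a given unit size."""
    return link_rate_bps / (8 * unit_bytes)


small_rate = units_per_second(1500)        # MTU-sized units
large_rate = units_per_second(64 * 1024)   # aggregated 64 KB units
reduction = small_rate / large_rate        # factor by which the unit rate drops

print(f"{small_rate:.0f} units/s vs {large_rate:.0f} units/s "
      f"(about {reduction:.1f}x fewer)")
```

Under these assumptions the per-unit work on the host drops by the ratio of the two unit sizes, independent of the link rate.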
- in TTO, no handshake may be utilized between the host operating system and the NIC containing the TTO engine.
- the TTO engine may operate autonomously in identifying new flows and in offloading them.
- the offload on the transmit side may be similar to LSO, where the host sends big transmission units and the TTO engine cuts them into smaller transmitted packets according to maximum segment size (MSS).
- MSS maximum segment size
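The transmit-side cutting described above can be illustrated as follows. This is a minimal sketch, assuming the payload is a byte string and that each segment is represented only by its starting sequence number and data; header construction is omitted, and the name `segment_payload` is hypothetical.

```python
def segment_payload(payload, start_sn, mss):
    """Cut one large transmit unit into segments of at most mss bytes.

    Returns a list of (sequence_number, data) pairs. Sequence numbers
    wrap modulo 2**32, as TCP sequence numbers do.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append(((start_sn + offset) % 2**32, chunk))
    return segments
```

For example, a 4000-byte unit with an MSS of 1460 yields two full segments and one 1080-byte segment.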
- Transparent TCP offload on the receive side may be performed by aggregating a plurality of received packets of the same flow and delivering them to the host as if they were received in one packet—one bigger packet in the case of received data packets, and one aggregate ACK packet in the case of received ACK packets.
- the processing in the host may be similar to the processing of a big packet that was received.
- rules may be defined to determine whether to aggregate packets. The aggregation rules may be established to allow as much aggregation as possible without increasing the round trip time, such that the decision whether to aggregate depends on the data that is received and the importance of delivering it to the host without delay.
- the aggregation may be implemented with transmit-receive coupling, wherein the transmitter and receiver are coupled, by utilizing transmission information for offload decisions, and the flow may be treated as a bidirectional flow.
- the context information of the receive offload in TTO may be maintained per flow. In this regard, for every received packet, the incoming packet header may be utilized to detect the flow it belongs to and this packet updates the context of the flow.
- the transmitted network packets may be searched along with the received network packets to determine the particular network flow to which the packet belongs.
- the transmitted network packet may enable updating of the context of the flow, which may be utilized for receive offload.
- the frame parser 143 may comprise suitable logic, circuitry and/or code that may enable L2 Ethernet processing including, for example, address filtering, frame validity and error detection of the incoming frames 141.
- the next stage of processing may comprise, for example, L3 such as IP processing and L4 such as TCP processing within the frame parser 143.
- the TTEEC 114 may reduce the host CPU 102 utilization and memory bandwidth, for example, by processing traffic on coalesced TCP/IP flows.
- the TTEEC 114 may detect, for example, the protocol to which incoming packets belong based on the packet parsing information and tuple 145.
- the TTEEC 114 may detect whether the packet corresponds to an offloaded TCP flow, for example, a flow for which at least some TCP state information may be kept by the TTEEC 114. If the packet corresponds to an offloaded connection, then the TTEEC 114 may direct data movement of the data payload portion of the frame. The destination of the payload data may be determined from the flow state information in combination with direction information within the frame. The destination may be a host memory 106, for example. Finally, the TTEEC 114 may update its internal TCP and higher levels of flow state, without any coordination with the state of the connection on the host TCP stack, and may obtain the host buffer address and length from its internal flow state.
- the receive system architecture may comprise, for example, a control path processing 140 and data movement engine 142.
- the system components above the control path as illustrated in upper portion of FIG. 1D may be designed to deal with the various processing stages used to complete, for example, the L3/L4 or higher processing with maximal flexibility and efficiency and targeting wire speed.
- the result of the stages of processing may comprise, for example, one or more packet identification cards that may provide a control structure that may carry information associated with the frame payload data. This may have been generated inside the TTEEC 114 while processing the packet in the various blocks.
- a data path 142 may move the payload data portions or raw packets 155 of a frame along from, for example, an on-chip packet frame buffer 154 and, upon control processing completion, to a direct memory access (DMA) engine 163 and subsequently to the host buffer 167 via the host bus 165 that was chosen via processing.
- the data path 142 to the DMA engine may comprise packet data and optional headers 161.
- the receiving system may perform, for example, one or more of the following: parsing the TCP/IP headers 145; associating the frame with a TCP/IP flow in the association block 149; fetching the TCP flow context in the context fetch block 151; processing the TCP/IP headers in the RX processing block 150; determining header/data boundaries and updating state 153; mapping the data to host buffers; and transferring the data via a DMA engine 163 into these host buffers 167.
- the headers may be consumed on chip or transferred to the host buffers 167 via the DMA engine 163.
- the packet frame buffer 154 may be an optional block in the receive system architecture. It may be utilized for the same purpose as, for example, a first-in-first-out (FIFO) data structure is used in a conventional L2 NIC or for storing higher layer traffic for additional processing.
- the packet frame buffer 154 in the receive system may not be limited to a single instance.
- the data path 142 may store the data between data processing stages one or more times.
- coalescing operations described for the coalescer 111 in FIG. 1B and/or for the coalescer 131 in FIG. 1C may be implemented in a coalescer 152 in the RX processing block 150 in FIG. 1D.
- buffering or storage of TCP data may be performed by, for example, the frame buffer 154.
- the FLT utilized by the coalescer 152 may be implemented using the off-chip storage 160 and/or the on-chip storage 162, for example.
- a new flow may be detected at some point during the flow lifetime.
- the flow state is unknown when the new flow is detected and the first packets are utilized to update the flow state until the flow is known to be in-order.
- a device performing TTO may also support other offload types, for example, TOE, RDMA, or iSCSI offload.
- the FLT for TTO may be shared with the connection search for other offload types with each entry in the FLT indicating the offload type for that flow. Packets that belong to flows of other offload types may not be candidates for TTO.
- upon detecting a new flow, the flow may be initiated with the basic initialization context. An entry in the FLT with a flow ID may be created.
- a plurality of segments of the same flow may be aggregated in TTO up to a receive aggregation length (RAL), presenting to the host a bigger segment for processing.
- RAL receive aggregation length
- the received packet may be placed in the host memory 126 but will not be delivered to the host. Instead, the host processor 124 may update the context of the flow this packet belongs to.
- the new incoming packet may be delivered immediately, either alone, if no previously aggregated packets are awaiting delivery, or as part of a single packet that represents both that packet and the previously received packets.
- the packet may not be delivered but may update the flow's context.
- a termination event may occur and the packet may not be aggregated if at least one of the following occurs at the TCP level: (1) the data is not in-order as derived from the received sequence number (SN) and the flow's context; (2) at least one packet with TCP flags other than the ACK flag, for example, a PUSH flag, is detected; (3) at least one packet with selective acknowledgement (SACK) information is detected; or (4) the received ACK SN is bigger than the delivered ACK SN and requires stopping the aggregation.
- a termination event may occur and the packet may not be aggregated if at least one of the following occurs at the IP level: (1) the type of service (TOS) field in the IP header is different from the TOS field of the previous packets that were aggregated; or (2) the received packet is an IP fragment.
- TOS type of service
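The TCP-level and IP-level termination conditions listed above may be expressed as a single predicate. This is a minimal sketch, assuming packets and flow context are plain dictionaries with hypothetical keys; TCP condition (4), which depends on ACK thresholds, is deliberately left out.

```python
ACK_FLAG = 0x10  # ACK bit in the TCP flags field


def is_termination_event(pkt, ctx):
    """Return True if pkt should not be aggregated into the flow described by ctx."""
    if pkt["sn"] != ctx["next_expected_sn"]:
        return True   # TCP (1): data is not in-order
    if pkt["flags"] & ~ACK_FLAG:
        return True   # TCP (2): a flag other than ACK is set, e.g. PUSH
    if pkt["has_sack"]:
        return True   # TCP (3): SACK information present
    if pkt["tos"] != ctx["tos"]:
        return True   # IP (1): TOS differs from previously aggregated packets
    if pkt["is_fragment"]:
        return True   # IP (2): packet is an IP fragment
    return False
```

A packet for which the predicate returns False remains a candidate for aggregation, subject to the timeout and length limits.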
- the aggregated packet's header may contain the aggregated header of all the individual packets it contains.
- a plurality of TCP rules for the aggregation may be as follows.
- (1) the SN in the aggregated header is the SN of the first or oldest packet; (2) the ACK SN is the SN of the last or youngest segment; (3) the length of the aggregated header is the sum of the lengths of all the aggregated packets; (4) the window in the aggregated header is the window received in the last or youngest aggregated packet; (5) the time stamp (TS) in the aggregated header is the TS received in the first or oldest aggregated packet; (6) the TS-echo in the aggregated header is the TS-echo received in the first or oldest aggregated packet; and (7) the checksum in the aggregated header is the accumulated checksum of all aggregated packets.
- a plurality of IP field aggregation rules may be provided.
- the TOS of the aggregated header may be that of all the aggregated packets;
- the time-to-live (TTL) of the aggregated header is the minimum of all incoming TTLs;
- the length of the aggregated header is the sum of the lengths in the aggregated packets;
- the fragment offset of the aggregated header may be zero for aggregated packets; and
- the packet ID of the aggregated header is the last ID received.
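The TCP and IP field aggregation rules above can be sketched as a merge over the headers of the aggregated packets, oldest first. This is a minimal sketch under stated assumptions: the `Hdr` type and its field names are hypothetical, all packets are assumed to share one TOS, and the accumulated checksum is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hdr:
    seq: int       # TCP sequence number
    ack: int       # TCP ACK sequence number
    length: int    # payload length in bytes
    window: int    # advertised TCP window
    ts: int        # TCP timestamp value
    ts_echo: int   # TCP timestamp echo
    tos: int       # IP type of service
    ttl: int       # IP time-to-live
    ip_id: int     # IP packet ID


def aggregate_headers(hdrs: List[Hdr]) -> Hdr:
    """Build one aggregated header from packets ordered oldest to youngest."""
    first, last = hdrs[0], hdrs[-1]
    return Hdr(
        seq=first.seq,                        # SN of the oldest packet
        ack=last.ack,                         # ACK SN of the youngest segment
        length=sum(h.length for h in hdrs),   # sum of all lengths
        window=last.window,                   # window of the youngest packet
        ts=first.ts,                          # TS of the oldest packet
        ts_echo=first.ts_echo,                # TS-echo of the oldest packet
        tos=first.tos,                        # common TOS of all packets
        ttl=min(h.ttl for h in hdrs),         # minimum incoming TTL
        ip_id=last.ip_id,                     # last ID received
    )
```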
- the received packets may be aggregated until the received packet cannot be aggregated due to the occurrence of a termination event, or if a timeout has expired on that flow, or if the aggregated packet exceeds RAL.
- the timeout may be implemented by setting a timeout to a value, timeout aggregation value, when the first packet on a flow is placed without delivery.
- the following packets that are aggregated may not change the timeout.
- the timeout may be canceled and may be set again in the next first packet that is not delivered.
- other embodiments of the invention may provide timeout implementation by periodically scanning all the flows.
- the received ACK SN may be relevant to determine the rules to aggregate pure ACKs and to determine the rules to stop aggregation of packets with data due to the received ACK SN.
- the duplicated pure ACKs may never be aggregated. When duplicated pure ACKs are received, they may cause prior aggregated packets to be delivered and the pure ACK may be delivered immediately separately.
- the received ACK SN may also be utilized to stop the aggregation and deliver the pending aggregated packet to the host TCP/IP stack.
- a plurality of rules may be provided for stopping the aggregation according to the ACK SN. For example, (1) if the number of acknowledged (ACKed) bytes that are not yet delivered, taking into account the received segments and the prior segments that were not delivered exceeds a threshold, ReceiveAckedBytesAggretation, for example, in bytes; or (2) the time from the arrival of the first packet that advanced the received ACK SN exceeds a threshold, TimeoutAckAggregation, for example.
- a second timer per flow may be required or other mechanisms, such as periodically scanning the flows may be implemented.
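The two ACK-based stop rules above can be sketched as a simple predicate. The function and parameter names are hypothetical; the two thresholds stand in for the byte and time thresholds named in the text, and time is assumed to be measured in seconds.

```python
def should_stop_ack_aggregation(
    acked_undelivered_bytes: int,
    first_ack_advance_time: float,
    now: float,
    acked_bytes_threshold: int,
    timeout_ack_aggregation: float,
) -> bool:
    """Return True if the pending aggregate should be delivered to the host."""
    # Rule (1): too many ACKed bytes have not yet been delivered.
    if acked_undelivered_bytes > acked_bytes_threshold:
        return True
    # Rule (2): too much time has passed since the first packet that
    # advanced the received ACK SN arrived.
    if now - first_ack_advance_time > timeout_ack_aggregation:
        return True
    return False
```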
- the flows may be removed from the host memory if one of the following occurs: (1) a reset (RST) flag was detected in the receive side; (2) a finish (FIN) flag was detected in the receive side; (3) there was no receive activity on the flow for a predefined time TerminateNoActivityTime, for example; (4) a KeepAlive packet in the receive direction was not acknowledged.
- a least recently used (LRU) cache may be used instead of a timeout rule to remove the flows from the host memory.
- the flows may be removed from the host memory if the flow was closed due to a retransmission timeout that requires information from the transmitter.
- retransmission timeout may comprise periodically scanning all the flows to determine if any flow is closed. The period for scanning may be low, for example, 5 seconds.
- the maximum transmitted sequence number (SN) may be recorded.
- the maximum received SN may be recorded. If, in two consecutive scans, there is pending data of the same type on the same flow and the recorded number is unchanged, this indicates pending data that was not acknowledged for the entire scan period. In this case the flow may be removed.
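The periodic scan described above can be sketched as follows. This is an illustrative sketch only: the dictionary layout, the function name, and the use of a single recorded sequence number per flow are assumptions, not details taken from the text.

```python
from typing import Dict, List, Tuple


def scan_flows(
    flows: Dict[str, Tuple[int, bool]],
    previous: Dict[str, int],
) -> Tuple[List[str], Dict[str, int]]:
    """Scan all flows once.

    flows maps a flow id to (recorded maximum SN, has_pending_data).
    previous is the snapshot returned by the prior scan.
    Returns the flows to remove and the snapshot for the next scan.
    """
    stalled: List[str] = []
    snapshot: Dict[str, int] = {}
    for flow_id, (max_sn, has_pending) in flows.items():
        if not has_pending:
            continue
        # Pending data with an unchanged SN across two consecutive scans
        # was not acknowledged for the entire scan period.
        if previous.get(flow_id) == max_sn:
            stalled.append(flow_id)
        snapshot[flow_id] = max_sn
    return stalled, snapshot
```

Each call is intended to be made once per scan period, feeding the returned snapshot into the next call.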
- FIG. 2A is a diagram illustrating exemplary steps that may be utilized for handling out-of-order data when a packet P3 and a packet P4 arrive out-of-order with respect to the order of transmission, in accordance with an embodiment of the invention.
- the packets P3 and P4 may arrive in-order with respect to each other at the NIC 128 but before the arrival of a packet P2, as shown in the actual receive RX traffic pattern 200.
- the packets P3 and P4 may correspond to a fourth packet and a fifth packet within an isle 211, respectively, in a TCP transmission sequence.
- a first disjoint portion in the TCP transmission sequence may result from the arrival of the packets P3 and P4 as shown in the TCP receive sequence space 202 after the isle 213 comprising packets P0 and P1.
- the rightmost portion of isle 211 rcv_nxt_R may be represented as (rcv_nxt_L+(length of isle)), where rcv_nxt_L is the leftmost portion of isle 211 and the length of isle is the sum of the lengths of packets P 3 and P 4 .
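The isle boundary relation above can be sketched in a few lines. The function name is hypothetical, and the modulo reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
from typing import Iterable

SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def isle_right_edge(rcv_nxt_l: int, packet_lengths: Iterable[int]) -> int:
    """Compute rcv_nxt_R = rcv_nxt_L + (length of isle), modulo 2**32."""
    return (rcv_nxt_l + sum(packet_lengths)) % SEQ_MOD
```

For example, an isle that starts at sequence number 3000 and holds two 1000-byte packets has its right edge at 5000.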
- FIG. 2B is a state diagram illustrating exemplary transparent TCP offload with transmit-receive coupling, in accordance with an embodiment of the invention.
- there is shown a plurality of exemplary flow states, namely, an in-order state 226, an Out-Of-Order (OOO) state 224, and an unknown state 222.
- the unknown state 222 may be detected for a flow for which a 3-way TCP handshake has not been detected or at some point in the life of the flow other than the initialization phase.
- the offload engine may also track outgoing and incoming TCP segments with a set synchronize (SYN) flag to detect a new flow.
- the exemplary transition states may be implemented as a state machine.
- the TCP 3-way handshake begins with a synchronize (SYN) segment containing an initial send sequence number (ISN) being chosen by, and sent from a first host.
- This sequence number may be the starting sequence number of the data in that packet and may increment for every byte of data sent within the segment.
- the second host may transmit a SYN segment with its own totally independent ISN number in the sequence number field along with an acknowledgment field.
- This acknowledgment (ACK) field may inform the recipient that its data was received at the other end and it expects the next segment of data bytes to be sent, and may be referred to as the SYN-ACK.
- When the first host receives this SYN-ACK segment, it may send an ACK segment containing the next sequence number, called a forward acknowledgement, which is received by the second host.
- the ACK segment may be identified by the ACK field being set. Segments that are not acknowledged within a certain period of time may be retransmitted.
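The handshake described above can be sketched by listing the three segments and their sequence and acknowledgment numbers. The function name and the dictionary layout are hypothetical. The sketch relies on the standard TCP behavior that a SYN consumes one sequence number, so each side acknowledges the peer's ISN plus one.

```python
from typing import Dict, List, Union

SEQ_MOD = 1 << 32  # 32-bit TCP sequence space


def three_way_handshake(isn_a: int, isn_b: int) -> List[Dict[str, Union[str, int]]]:
    """Return the SYN, SYN-ACK and ACK segments for hosts with ISNs isn_a and isn_b."""
    syn = {"flags": "SYN", "seq": isn_a}
    syn_ack = {
        "flags": "SYN-ACK",
        "seq": isn_b,                      # second host's independent ISN
        "ack": (isn_a + 1) % SEQ_MOD,      # next byte expected from the first host
    }
    ack = {
        "flags": "ACK",
        "seq": (isn_a + 1) % SEQ_MOD,
        "ack": (isn_b + 1) % SEQ_MOD,      # next byte expected from the second host
    }
    return [syn, syn_ack, ack]
```

An offload engine that observes all three segments on a flow could start that flow with known sequence numbers in both directions.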
- the state diagram may track the out-of-order isle sequence number boundaries using, for example, the parameters rcv_nxt_R and rcv_nxt_L as illustrated in FIG. 2C.
- the first ingress segment may be referred to as an isle, for example, isle 213 (FIG. 2C) and the ordering state may be set to OOO state 224.
- the rightmost portion of isle rcv_nxt_R may be represented as (rcv_nxt_L+(length of isle)), where rcv_nxt_L is the leftmost portion of isle and the length of isle is the sum of the lengths of packets in the isle.
- the NIC 128 may access the local stack acknowledgment information as the transmitter and receiver are coupled.
- the ordering state may be modified from OOO state 224 to in-order state 226 whenever an egress ACK sequence number is greater than an isle length of at least one TCP segment.
- the initial ordering state may be set to the in-order state 226 if the new flow is detected with the TCP 3-way handshake.
- the ordering state may be modified from in order state 226 to OOO state 224 if the isle length is not equal to rcv_nxt_R.
- the number of ACKed bytes that have not yet been delivered may exceed a fraction of the pending transmitted bytes that were not ACKed.
- the pending transmitted bytes count may be calculated as the difference between sndMax, the most advanced sequence number (SN) that was transmitted, and the last received ACK SN that was delivered.
- the number of ACKed bytes may exceed a dynamic threshold. This threshold may depend on the size of the packets that were transmitted to the peer. The sizes of the transmitted packets or the size of the transmission units that were sent to the chip to be transmitted and were not yet ACKed may be recorded. In case of LSO, the aggregation may be set to ACK blocks of data similar to the transmitted data units.
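The pending-bytes calculation and the fraction rule above can be sketched as follows. The function names and the default fraction of one half are hypothetical; the text does not specify a fraction. The subtraction is taken modulo 2**32 because TCP sequence numbers wrap.

```python
SEQ_MOD = 1 << 32  # 32-bit TCP sequence space


def pending_transmitted_bytes(snd_max: int, last_delivered_ack_sn: int) -> int:
    """Bytes transmitted up to sndMax that the last delivered ACK SN does not cover."""
    return (snd_max - last_delivered_ack_sn) % SEQ_MOD


def exceeds_pending_fraction(
    acked_undelivered_bytes: int,
    snd_max: int,
    last_delivered_ack_sn: int,
    fraction: float = 0.5,  # assumed value, for illustration only
) -> bool:
    """True if undelivered ACKed bytes exceed the given fraction of pending bytes."""
    pending = pending_transmitted_bytes(snd_max, last_delivered_ack_sn)
    return acked_undelivered_bytes > fraction * pending
```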
- FIG. 3 is a flow chart illustrating exemplary steps for transparent TCP offload with per flow estimation of far end transmit window, in accordance with an embodiment of the invention.
- each of the received TCP segments and the transmitted TCP segments may be monitored to determine which network flow they belong to based on their respective header information.
- it may be determined whether the received TCP segments are for a particular network flow. If the received TCP segments are not for a particular network flow, control passes to step 302. If the received TCP segments are for a particular network flow, control passes to step 304.
- a network interface card (NIC) processor 130 may enable collection of at least one received TCP segment for a determined network flow.
- the NIC 128 enables storage of state information for the received TCP segment and state information for transmitted TCP segments for the determined network flow without transferring state information for the received TCP segment and the state information for the transmitted TCP segments to a host system 124 communicatively coupled to the NIC 128.
- the NIC 128 may determine the time period over which the received TCP segments are aggregated before transmitting them to the host system 124.
- the time period for aggregation may be a minimum of a time period for a termination event to occur and a time period for the far-end effective transmit window.
- the far-end effective transmit window may be calculated as a maximum value of (rcv_nxt_R-ack_sn) observed since the flow started in an in-order state, where rcv_nxt_R may represent the sequence number of the next expected TCP segment and ack_sn may represent a sequence number of the next received acknowledgement packet from the host system.
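The running-maximum estimate described above can be sketched as a small per-flow object. The class and method names are hypothetical, and the modulo handling of sequence-number wrap-around is an assumption added for the sketch.

```python
SEQ_MOD = 1 << 32  # 32-bit TCP sequence space


class FarEndWindowEstimator:
    """Per-flow estimate of the far-end effective transmit window."""

    def __init__(self) -> None:
        self.estimate = 0  # largest (rcv_nxt_R - ack_sn) observed so far

    def observe(self, rcv_nxt_r: int, ack_sn: int) -> int:
        """Update the estimate from one observation and return it."""
        # Bytes received from the far end that the host has not yet ACKed.
        outstanding = (rcv_nxt_r - ack_sn) % SEQ_MOD
        self.estimate = max(self.estimate, outstanding)
        return self.estimate
```

Because the estimate only ever grows, it approaches the largest amount of unacknowledged data the far end has been seen to send on that flow.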
- a new TCP segment may be generated by aggregating at least a portion of a plurality of the collected TCP segments for the determined network flow over the determined period of time.
- the NIC 128 enables communication of the generated new TCP segment, new state information for the new TCP segment, and the state information for the transmitted TCP segments to the host system 124 for TCP offload.
- Another embodiment of the invention may provide a machine-readable storage, having stored thereon, a computer program having at least one code section executable by a machine, thereby causing the machine to perform the steps as described above for transparent TCP offload with per flow estimation of a far end transmit window.
- the present invention may be realized in hardware, software, or a combination of hardware and software.
- the present invention may be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited.
- a typical combination of hardware and software may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
- the present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods.
- Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
Description
- Each of the above referenced applications is hereby incorporated herein by reference in its entirety.
- Certain embodiments of the invention relate to processing of TCP data and related TCP information. More specifically, certain embodiments of the invention relate to a method and system for transparent TCP offload with per flow estimation of a far end transmit window.
- There are different approaches for reducing the processing power required for TCP/IP stack processing. In a TCP Offload Engine (TOE), the offloading engine performs all or most of the TCP processing, presenting to the upper layer a stream of data. There may be various disadvantages to this approach. The TOE is tightly coupled with the operating system and therefore requires solutions that are dependent on the operating system and may require changes in the operating system to support it. The TOE may require a side-by-side stack solution that needs some kind of manual configuration, either by the application, for example, by explicitly specifying a socket address family for accelerated connections, or by an IT administrator, for example, by explicitly specifying an IP subnet address for accelerated connections to select which of the TCP flows will be offloaded. In addition, the offload engine is very complex, as it needs to implement TCP packet processing.
- Large segment offload (LSO)/transmit segment offload (TSO) may be utilized to reduce the required host processing power by reducing the transmit packet processing. In this approach, the host sends the NIC transmit units that are bigger than the maximum transmission unit (MTU), and the NIC cuts them into segments according to the MTU. Since part of the host processing is linear in the number of transmitted units, this reduces the required host processing power. While being efficient in reducing the transmit packet processing, LSO does not help with receive packet processing. In addition, for each single large transmit unit sent by the host, the host would receive from the far end multiple ACKs, one for each MTU-sized segment. The multiple ACKs require consumption of scarce and expensive bandwidth, thereby reducing throughput and efficiency.
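The cutting step of LSO can be sketched as splitting one large payload into segment-sized pieces. The function name is hypothetical, and the sketch counts payload bytes only, ignoring per-segment headers.

```python
from typing import List


def segment_lso(payload_len: int, segment_size: int) -> List[int]:
    """Return the payload length of each segment cut from one large transmit unit."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [
        min(segment_size, payload_len - offset)
        for offset in range(0, payload_len, segment_size)
    ]
```

With a 65536-byte unit and 1460-byte segments this yields 45 segments, so a far end that acknowledges every segment would return up to 45 ACKs for a single unit handed down by the host.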
- In large receive offload (LRO), a stateless receive offload mechanism, the TCP flows may be split to multiple hardware queues, according to a hash function that guarantees that a specific TCP flow would always be directed into the same hardware queue. For each hardware queue, the mechanism takes advantage of interrupt coalescing to scan the queue and aggregate subsequent packets on the queue belonging to the same TCP flow into a single large receive unit.
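The queue-selection property described above, where a given flow always lands in the same hardware queue, can be sketched with any deterministic hash of the flow identifiers. The function name and the choice of CRC-32 are assumptions for illustration; the text does not name a hash function.

```python
import zlib


def queue_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                   num_queues: int) -> int:
    """Map a TCP flow to a hardware queue index in [0, num_queues)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode("ascii")
    # The same 4-tuple always produces the same CRC, hence the same queue.
    return zlib.crc32(key) % num_queues
```

When there are more flows than queues, several flows necessarily share a queue, which is the situation the limitations in the next paragraph refer to.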
- While this mechanism does not require any additional hardware from the NIC besides multiple hardware queues, it may have various performance limitations. For example, if the number of flows were larger than the number of hardware queues, multiple flows would fall into the same queue, resulting in no LRO aggregation for that queue. If the number of flows is larger than twice the number of hardware queues, no LRO aggregation is performed on any of the flows. The aggregation may be limited to the number of packets available to the host in one interrupt period. If the interrupt period is short, and the number of flows is not small, the number of packets that are available to the host CPU for aggregation on each flow may be small, resulting in limited or no LRO aggregation, even if the number of hardware queues is large. The LRO aggregation may be performed on the host CPU, resulting in additional processing. The driver may deliver to the TCP stack a linked list of buffers comprising a header buffer followed by a series of data buffers, which may require more processing than in the case where all the data is contiguously delivered on one buffer.
- Accordingly, the computational power of the offload engine needs to be very high or at least the system needs a very large buffer to compensate for any additional delays due to the delayed processing of the out-of-order segments. When host memory is used for temporary storage of out-of-order segments, additional system memory bandwidth may be consumed when the previously out-of-order segments are copied to respective buffers. The additional copying provides a challenge for present memory subsystems, and as a result, these memory subsystems are unable to support high rates such as 10 Gbps.
- In general, one challenge faced by TCP implementers wishing to design a flow-through NIC is that TCP segments may arrive out-of-order with respect to the order in which they were transmitted. This may prevent or otherwise hinder the immediate processing of the TCP control data and prevent the placing of the data in a host buffer. Accordingly, an implementer may be faced with the option of dropping out-of-order TCP segments or storing the TCP segments locally on the NIC until all the missing segments have been received. Once all the TCP segments have been received, they may be reordered and processed accordingly. In instances where the TCP segments are dropped or otherwise discarded, the sending side may have to re-transmit all the dropped TCP segments, which in some instances may result in about a fifty percent (50%) decrease in throughput or bandwidth utilization.
- Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.
- A method and/or system for transparent TCP offload with per flow estimation of a far end transmit window, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
- These and other advantages, aspects and novel features of the present invention, as well as details of an illustrated embodiment thereof, will be more fully understood from the following description and drawings.
- FIG. 1A is a block diagram of an exemplary system for transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 1B is a block diagram of another exemplary system for transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 1C is an alternative embodiment of an exemplary system for transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 1D is a block diagram of a system for handling transparent TCP offload with transmit and receive coupling, in accordance with an embodiment of the invention.
- FIG. 2A is a diagram illustrating exemplary steps that may be utilized for handling out-of-order data when a packet P3 and a packet P4 arrive out-of-order with respect to the order of transmission, in accordance with an embodiment of the invention.
- FIG. 2B is a flow chart illustrating exemplary steps for transparent TCP offload with transmit-receive coupling, in accordance with an embodiment of the invention.
- FIG. 3 is a flow chart illustrating exemplary steps for transparent TCP offload with per flow estimation of far end transmit window, in accordance with an embodiment of the invention.
- Certain embodiments of the invention may be found in a method and system for transparent TCP offload. Aspects of the method and system may comprise storing at a network interface card (NIC) processor state information for a received TCP segment and state information for transmitted TCP segments for a determined network flow without transferring state information for the received TCP segment to a host system communicatively coupled to the NIC. The generation of a new TCP segment comprising the collected received TCP segments may be controlled based on the occurrence of a termination event and a transmit window size. The period of time for aggregation of received TCP segments may be calculated based on the sequence numbers of the next expected TCP segment and the next received acknowledgement packet. The generated new TCP segment, new state information for the generated new TCP segment, and the state information for the transmitted TCP segments may be communicated to the host system for TCP offload.
- Under conventional processing, each of the plurality of TCP segments received would have to be individually processed by a host processor in the host system. TCP processing requires extensive CPU processing power in terms of both protocol processing and data placement on the receiver side. Current processing systems and methods involve the transfer of TCP state to dedicated hardware such as a NIC, where significant changes to the host TCP stack and/or underlying hardware are required.
- FIG. 1A is a block diagram of an exemplary system for transparent TCP offload, in accordance with an embodiment of the invention. Accordingly, the system of FIG. 1A may be adapted to handle transparent TCP offload of transmission control protocol (TCP) datagrams or packets. Referring to FIG. 1A, the system may comprise, for example, a CPU 102, a memory controller 104, a host memory 106, a host interface 108, a network subsystem 110 and an Ethernet 112. The network subsystem 110 may comprise, for example, a transparent TCP-enabled Ethernet Controller (TTEEC) or a transparent TCP offload engine (TTOE) 114. The network subsystem 110 may comprise, for example, a network interface card (NIC). The host interface 108 may be, for example, a peripheral component interconnect (PCI), PCI-X, PCI-Express, ISA, SCSI or other type of bus. The memory controller 104 may be coupled to the CPU 102, to the host memory 106 and to the host interface 108. The host interface 108 may be coupled to the network subsystem 110 via the TTEEC/TTOE 114.
- FIG. 1B is a block diagram of another exemplary system for transparent TCP offload, in accordance with an embodiment of the invention. Referring to FIG. 1B, the system may comprise, for example, a CPU 102, a host memory 106, a dedicated memory 116 and a chip set 118. The chip set 118 may comprise, for example, the network subsystem 110 and the memory controller 104. The chip set 118 may be coupled to the CPU 102, to the host memory 106, to the dedicated memory 116 and to the Ethernet 112. The network subsystem 110 of the chip set 118 may be coupled to the Ethernet 112. The network subsystem 110 may comprise, for example, the TTEEC/TTOE 114 that may be coupled to the Ethernet 112. The network subsystem 110 may communicate to the Ethernet 112 via a wired and/or a wireless connection, for example. The wireless connection may be a wireless local area network (WLAN) connection as supported by the IEEE 802.11 standards, for example. The network subsystem 110 may also comprise, for example, an on-chip memory 113. The dedicated memory 116 may provide buffers for context and/or data.
- The network subsystem 110 may comprise a processor such as a coalescer 111. The coalescer 111 may comprise suitable logic, circuitry and/or code that may be enabled to handle the accumulation or coalescing of TCP data. In this regard, the coalescer 111 may utilize a flow lookup table (FLT) to maintain information regarding current network flows for which TCP segments are being collected for aggregation. The FLT may be stored in, for example, the network subsystem 110. The FLT may comprise at least one of the following: a source IP address, a destination IP address, a source TCP address, a destination TCP address, for example. In an alternative embodiment of the invention, at least two different tables may be utilized, for example, a table comprising a 4-tuple lookup to classify incoming packets according to their flow. The 4-tuple lookup table may comprise at least one of the following: a source IP address, a destination IP address, a source TCP address, a destination TCP address, for example. A flow context table may comprise state variables utilized for aggregation such as TCP sequence numbers.
- The FLT may also comprise at least one of a host buffer or memory address including a scatter-gather-list (SGL) for non-continuous memory, cumulative acknowledgments (ACKs), a copy of a TCP header and options, a copy of an IP header and options, a copy of an Ethernet header, and/or accumulated TCP flags, for example. The coalescer 111 may be enabled to generate a single aggregated TCP segment from the accumulated or collected TCP segments when a termination event occurs. The single aggregated TCP segment may be communicated to the host memory 106, for example.
- Although illustrated, for example, as a CPU and an Ethernet, the present invention need not be so limited to such examples and may employ, for example, any type of processor and any type of data link layer or physical media, respectively. Accordingly, although illustrated as coupled to the Ethernet 112, the TTEEC or the TTOE 114 of FIG. 1A may be adapted for any type of data link layer or physical media. Furthermore, the present invention also contemplates different degrees of integration and separation between the components illustrated in FIGS. 1A-B. For example, the TTEEC/TTOE 114 may be a separate integrated chip from the chip set 118 embedded on a motherboard or may be embedded in a NIC. Similarly, the coalescer 111 may be a separate integrated chip from the chip set 118 embedded on a motherboard or may be embedded in a NIC. In addition, the dedicated memory 116 may be integrated with the chip set 118 or may be integrated with the network subsystem 110 of FIG. 1B.
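The flow lookup table (FLT) used by the coalescer 111 can be pictured as a map from a 4-tuple to a small per-flow context. The following is a minimal sketch, not the patented design: the class names, the choice of a Python dictionary, and the two context fields are all hypothetical, and the 4-tuple is taken to be source and destination IP addresses plus source and destination TCP ports.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

FourTuple = Tuple[str, str, int, int]  # (src IP, dst IP, src port, dst port)
SEQ_MOD = 1 << 32                      # 32-bit TCP sequence space


@dataclass
class FlowContext:
    next_seq: int              # next in-order sequence number expected
    aggregated_bytes: int = 0  # payload bytes collected but not yet delivered


class FlowLookupTable:
    """Classifies segments by 4-tuple and keeps a per-flow context."""

    def __init__(self) -> None:
        self._flows: Dict[FourTuple, FlowContext] = {}

    def lookup_or_create(self, key: FourTuple, seq: int) -> FlowContext:
        if key not in self._flows:
            self._flows[key] = FlowContext(next_seq=seq)
        return self._flows[key]

    def on_segment(self, key: FourTuple, seq: int, payload_len: int) -> bool:
        """Aggregate an in-order segment; return False if it is out of order."""
        ctx = self.lookup_or_create(key, seq)
        if seq != ctx.next_seq:
            return False
        ctx.aggregated_bytes += payload_len
        ctx.next_seq = (seq + payload_len) % SEQ_MOD
        return True
```

Keeping the classification key separate from the context mirrors the alternative embodiment in which a 4-tuple lookup table and a flow context table are distinct.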
- FIG. 1C is an alternative embodiment of an exemplary system for transparent TCP offload, in accordance with an embodiment of the invention. Referring to FIG. 1C, there is shown a host processor 124, a host memory/buffer 126, a software algorithm block 134 and a NIC block 128. The NIC block 128 may comprise a NIC processor 130, a processor such as a coalescer 131 and a reduced NIC memory/buffer block 132. The NIC block 128 may communicate with an external network via a wired and/or a wireless connection, for example. The wireless connection may be a wireless local area network (WLAN) connection as supported by the IEEE 802.11 standards, for example.
- The coalescer 131 may be a dedicated processor or hardware state machine that may reside in the packet-receiving path. The host TCP stack may comprise software that enables management of the TCP protocol processing and may be part of an operating system, such as Microsoft Windows or Linux. The coalescer 131 may comprise suitable logic, circuitry and/or code that may enable accumulation or coalescing of TCP data. In this regard, the coalescer 131 may utilize a flow lookup table (FLT) to maintain information regarding current network flows for which TCP segments are being collected for aggregation. The FLT may be stored in, for example, the reduced NIC memory/buffer block 132. The coalescer 131 may enable generation of a single aggregated TCP segment from the accumulated or collected TCP segments when a termination event occurs. The single aggregated TCP segment may be communicated to the host memory/buffer 126, for example.
- In accordance with certain embodiments of the invention, providing a single aggregated TCP segment to the host for TCP processing significantly reduces overhead processing by the host 124. Furthermore, since there is no transfer of TCP state information, dedicated hardware such as a NIC 128 may assist with the processing of received TCP segments by coalescing or aggregating multiple received TCP segments so as to reduce per-packet processing overhead.
- In conventional TCP processing systems, it is necessary to know certain information about a TCP connection prior to arrival of a first segment for that TCP connection. In accordance with various embodiments of the invention, it is not necessary to know about the TCP connection prior to arrival of the first TCP segment since the TCP state or context information is still solely managed by the host TCP stack and there is no transfer of state information between the hardware stack and the software stack at any given time.
- In an embodiment of the invention, an offload mechanism may be provided that is stateless from the host stack perspective, while state-full from the offload device perspective, achieving comparable performance gain when compared to TTOE. Transparent TCP offload (TTO) reduces the host processing power required for TCP by allowing the host system to process both receive and transmit data units that are bigger than a MTU. In an exemplary embodiment of the invention, 64 KB of processing data units (PDUs) may be processed rather than 1.5 KB PDUs in order to produce a significant reduction in the packet rate, thus reducing the host processing power for packet processing.
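The packet-rate reduction mentioned above can be checked with simple arithmetic. This sketch assumes 1.5 KB means 1536 bytes and 64 KB means 65536 bytes, and the function name is hypothetical.

```python
def units_needed(total_bytes: int, unit_bytes: int) -> int:
    """Number of data units the host must process to move total_bytes (ceiling division)."""
    return -(-total_bytes // unit_bytes)


# Moving 1 MiB of data with small versus large processing units.
small_units = units_needed(1 << 20, 1536)    # 1.5 KB units
large_units = units_needed(1 << 20, 65536)   # 64 KB units
```

Under these assumptions the host handles 683 units in the first case and 16 in the second, roughly a 43-fold reduction in the number of units processed.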
- In TTO, no handshake may be utilized between the host operating system and the NIC containing the TTO engine. The TTO engine may operate autonomously in identifying new flows and for offloading. The offload on the transmit side may be similar to LSO, where the host sends big transmission units and the TTO engine cuts them to smaller transmitted packets according to maximum segment size (MSS).
- Transparent TCP offload on the receive side may be performed by aggregating a plurality of received packets of the same flow and delivering them to the host as if they were received in one packet—one bigger packet in the case of received data packets, and one aggregate ACK packet in the case of received ACK packets. The processing in the host may be similar to the processing of a big packet that was received. In the case of TCP flow aggregation, rules may be defined to determine whether to aggregate packets. The aggregation rules may be established to allow as much aggregation as possible, without increasing the round trip time such that the decision whether to aggregate depends on the data that is received and the importance of delivering it to the host without delay. The aggregation may be implemented with transmit-receive coupling, wherein the transmitter and receiver are coupled, by utilizing transmission information for offload decisions, and the flow may be treated as a bidirectional flow. The context information of the receive offload in TTO may be maintained per flow. In this regard, for every received packet, the incoming packet header may be utilized to detect the flow it belongs to and this packet updates the context of the flow.
- When the transmitter and receiver are coupled, the transmitted network packets may be searched along with the received network packets to determine the particular network flow to which the packet belongs. The transmitted network packet may enable updating of the context of the flow, which may be utilized for receive offload.
- FIG. 1D is a block diagram of a system for handling transparent TCP offload, in accordance with an embodiment of the invention. Referring to FIG. 1D, there is shown an incoming packet frame 141, a frame parser 143, an association block 149, a context fetch block 151, a plurality of on-chip cache blocks 147, a plurality of off-chip storage blocks 160, a plurality of on-chip storage blocks 162, a RX processing block 150, a frame buffer 154, a DMA engine 163, a TCP code block 157, a host bus 165, and a plurality of host buffers 167. The RX processing block 150 may comprise a coalescer 152.
- The frame parser 143 may comprise suitable logic, circuitry and/or code that may enable L2 Ethernet processing including, for example, address filtering, frame validity and error detection of the incoming frames 141. Unlike an ordinary Ethernet controller, the next stage of processing may comprise, for example, L3 such as IP processing and L4 such as TCP processing within the frame parser 143. The TTEEC 114 may reduce the host CPU 102 utilization and memory bandwidth, for example, by processing traffic on coalesced TCP/IP flows. The TTEEC 114 may detect, for example, the protocol to which incoming packets belong based on the packet parsing information and tuple 145. If the protocol is TCP, then the TTEEC 114 may detect whether the packet corresponds to an offloaded TCP flow, for example, a flow for which at least some TCP state information may be kept by the TTEEC 114. If the packet corresponds to an offloaded connection, then the TTEEC 114 may direct data movement of the data payload portion of the frame. The destination of the payload data may be determined from the flow state information in combination with direction information within the frame. The destination may be a host memory 106, for example. Finally, the TTEEC 114 may update its internal TCP and higher levels of flow state, without any coordination with the state of the connection on the host TCP stack, and may obtain the host buffer address and length from its internal flow state.
- The receive system architecture may comprise, for example, a control path processing 140 and data movement engine 142. The system components above the control path, as illustrated in the upper portion of FIG. 1D, may be designed to deal with the various processing stages used to complete, for example, the L3/L4 or higher processing with maximal flexibility and efficiency and targeting wire speed. The result of the stages of processing may comprise, for example, one or more packet identification cards that may provide a control structure that may carry information associated with the frame payload data. This may have been generated inside the TTEEC 114 while processing the packet in the various blocks. A data path 142 may move the payload data portions or raw packets 155 of a frame along from, for example, an on-chip packet frame buffer 154 and, upon control processing completion, to a direct memory access (DMA) engine 163 and subsequently to the host buffer 167 via the host bus 165 that was chosen via processing. The data path 142 to the DMA engine may comprise packet data and optional headers 161.
- The receiving system may perform, for example, one or more of the following: parsing the TCP/IP headers 145; associating the frame with a TCP/IP flow in the association block 149; fetching the TCP flow context in the context fetch block 151; processing the TCP/IP headers in the RX processing block 150; determining header/data boundaries and updating state 153; mapping the data to host buffers; and transferring the data via a DMA engine 163 into these host buffers 167. The headers may be consumed on chip or transferred to the host buffers 167 via the DMA engine 163.
- The packet frame buffer 154 may be an optional block in the receive system architecture. It may be utilized for the same purpose as, for example, a first-in-first-out (FIFO) data structure is used in a conventional L2 NIC or for storing higher layer traffic for additional processing. The packet frame buffer 154 in the receive system may not be limited to a single instance. As control path 140 processing is performed, the data path 142 may store the data between data processing stages one or more times.
- In an exemplary embodiment of the invention, at least a portion of the coalescing operations described for the coalescer 111 in
FIG. 1B and/or for thecoalescer 131 inFIG. 1C may be implemented in acoalescer 152 in theRX processing block 150 inFIG. 1D . In this instance, buffering or storage of TCP data may be performed by, for example, theframe buffer 154. Moreover, the FLT utilized by thecoalescer 152 may be implemented using the off-chip storage 160 and/or the on-chip storage 162, for example. - In an embodiment of the invention, a new flow may be detected at some point during the flow lifetime. The flow state is unknown when the new flow is detected and the first packets are utilized to update the flow state until the flow is known to be in-order. A device performing TTO may also support other offload types, for example, TOE, RDMA, or iSCSI offload. In this case, the FLT for TTO may be shared with the connection search for other offload types with each entry in the FLT indicating the offload type for that flow. Packets that belong to flows of other offload types may not be candidates for TTO. Upon detecting a new flow, the flow may be initiated with the basic initialization context. An entry in the FLT with a flow ID may be created.
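The shared FLT described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a flow is keyed by its TCP/IP 4-tuple and that each entry stores a flow ID, an offload type, and an ordering state. The names FlowEntry, FlowLookupTable, lookup_or_create, and is_tto_candidate are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical flow key: (source IP, source port, destination IP, destination port).
FlowKey = Tuple[str, int, str, int]


@dataclass
class FlowEntry:
    flow_id: int
    offload_type: str                 # for example "TTO", "TOE", "RDMA", "iSCSI"
    ordering_state: str = "UNKNOWN"   # flow state is unknown when first detected


class FlowLookupTable:
    """Illustrative FLT shared by several offload types."""

    def __init__(self) -> None:
        self._entries: Dict[FlowKey, FlowEntry] = {}
        self._next_flow_id = 0

    def lookup_or_create(self, key: FlowKey, offload_type: str = "TTO") -> FlowEntry:
        entry = self._entries.get(key)
        if entry is None:
            # New flow detected: create an entry with a basic initialization context.
            entry = FlowEntry(self._next_flow_id, offload_type)
            self._next_flow_id += 1
            self._entries[key] = entry
        return entry


def is_tto_candidate(entry: FlowEntry) -> bool:
    # Packets on flows of other offload types are not candidates for TTO.
    return entry.offload_type == "TTO"
```

In this sketch a second lookup with the same key returns the existing entry, so the ordering state accumulated from earlier packets is preserved.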
- In another embodiment of the invention, a plurality of segments of the same flow may be aggregated in TTO up to a receive aggregation length (RAL), presenting to the host a bigger segment for processing. If aggregation is allowed, the received packet may be placed in the host memory 126 but not delivered to the host. Instead, the host processor 124 may update the context of the flow this packet belongs to. A new incoming packet may either be delivered immediately on its own, if no previously aggregated packets are pending delivery, or be delivered as a single packet that represents both that packet and the previously received packets. In another embodiment of the invention, the packet may not be delivered but may update the flow's context.
- A termination event may occur and the packet may not be aggregated if at least one of the following occurs at the TCP level: (1) the data is not in-order as derived from the received sequence number (SN) and the flow's context; (2) at least one packet with TCP flags other than the ACK flag, for example, a PUSH flag, is detected; (3) at least one packet with selective acknowledgement (SACK) information is detected; or (4) the ACK SN received is bigger than the delivered ACK SN, and requires stopping the aggregation. Similarly, a termination event may occur and the packet may not be aggregated if at least one of the following occurs at the IP level: (1) the type of service (TOS) field in the IP header is different than the TOS field of the previous packets that were aggregated; or (2) the received packet is an IP fragment.
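The TCP-level and IP-level termination conditions listed above can be sketched as a single predicate. This is a minimal Python illustration, not the patented implementation: the Packet and FlowContext types and their field names are assumptions, and TCP rule (4) is modeled with a hypothetical ack_advance_stops switch because the source leaves that decision to the ACK-based rules.

```python
from dataclasses import dataclass

ACK_FLAG = 0x10  # ACK bit in the TCP flags field


@dataclass
class Packet:
    seq: int
    ack: int
    flags: int
    has_sack: bool
    tos: int
    is_ip_fragment: bool


@dataclass
class FlowContext:
    expected_seq: int                 # next in-order SN derived from the flow context
    delivered_ack: int                # last ACK SN delivered to the host
    aggregated_tos: int               # TOS of the packets aggregated so far
    ack_advance_stops: bool = False   # hypothetical policy switch for TCP rule (4)


def is_termination_event(pkt: Packet, ctx: FlowContext) -> bool:
    tcp_level = (
        pkt.seq != ctx.expected_seq                 # (1) data not in-order
        or (pkt.flags & ~ACK_FLAG) != 0             # (2) any flag other than ACK
        or pkt.has_sack                             # (3) SACK information present
        or (pkt.ack > ctx.delivered_ack and ctx.ack_advance_stops)  # (4)
    )
    ip_level = (
        pkt.tos != ctx.aggregated_tos               # (1) TOS differs
        or pkt.is_ip_fragment                       # (2) IP fragment
    )
    return tcp_level or ip_level
```

Sequence number wraparound is ignored here for brevity.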
- When aggregating a plurality of packets to a single packet, the aggregated packet's header may contain the aggregated header of all the individual packets it contains. In an exemplary embodiment of the invention, a plurality of TCP rules for the aggregation may be as follows. For example, (1) the SN in the aggregated header is the SN of the first or oldest packet; (2) the ACK SN is the SN of the last or youngest segment; (3) the length of the aggregated header is the sum of the lengths of all the aggregated packets; (4) the window in the aggregated header is the window received in the last or youngest aggregated packet; (5) the time stamp (TS) in the aggregated header is the TS received in the first or oldest aggregated packet; (6) the TS-echo in the aggregated header is the TS-echo received in the first or oldest aggregated packet; and (7) the checksum in the aggregated header is the accumulated checksum of all aggregated packets.
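The seven TCP aggregation rules above can be expressed compactly. The following Python sketch assumes a simplified TcpHeader record with hypothetical field names, reads rule (2) as taking the ACK SN from the last segment, and assumes that the "accumulated checksum" is a 16-bit one's-complement sum; the source does not specify how the checksum is accumulated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TcpHeader:
    seq: int
    ack: int
    length: int
    window: int
    ts: int
    ts_echo: int
    checksum: int


def ones_complement_add(a: int, b: int) -> int:
    # 16-bit one's-complement addition with end-around carry (an assumption).
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def aggregate_tcp_headers(headers: List[TcpHeader]) -> TcpHeader:
    if not headers:
        raise ValueError("nothing to aggregate")
    first, last = headers[0], headers[-1]
    checksum = 0
    for h in headers:
        checksum = ones_complement_add(checksum, h.checksum)
    return TcpHeader(
        seq=first.seq,                           # (1) SN of the first (oldest) packet
        ack=last.ack,                            # (2) ACK SN of the last (youngest) segment
        length=sum(h.length for h in headers),   # (3) sum of the lengths
        window=last.window,                      # (4) window of the last packet
        ts=first.ts,                             # (5) TS of the first packet
        ts_echo=first.ts_echo,                   # (6) TS-echo of the first packet
        checksum=checksum,                       # (7) accumulated checksum
    )
```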
- In an exemplary embodiment of the invention, a plurality of IP field aggregation rules may be provided. For example, (1) the TOS of the aggregated header may be that of all the aggregated packets; (2) the time-to-live (TTL) of the aggregated header is the minimum of all incoming TTLs; (3) the length of the aggregated header is the sum of the lengths in the aggregated packets; (4) the fragment offset of the aggregated header may be zero for aggregated packets; and (5) the packet ID of the aggregated header is the last ID received.
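The IP field rules above follow the same pattern. This Python sketch uses a hypothetical IpHeader record; the length field is summed exactly as rule (3) states, without attempting to account for per-packet header overhead, which the source does not discuss.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IpHeader:
    tos: int
    ttl: int
    length: int
    fragment_offset: int
    packet_id: int


def aggregate_ip_headers(headers: List[IpHeader]) -> IpHeader:
    if not headers:
        raise ValueError("nothing to aggregate")
    tos = headers[0].tos
    if any(h.tos != tos for h in headers):
        # A differing TOS is a termination event, so it should never reach here.
        raise ValueError("packets with different TOS cannot be aggregated")
    return IpHeader(
        tos=tos,                                 # (1) common TOS of all packets
        ttl=min(h.ttl for h in headers),         # (2) minimum of all incoming TTLs
        length=sum(h.length for h in headers),   # (3) sum of the lengths
        fragment_offset=0,                       # (4) zero for aggregated packets
        packet_id=headers[-1].packet_id,         # (5) last ID received
    )
```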
- The received packets may be aggregated until the received packet cannot be aggregated due to the occurrence of a termination event, or if a timeout has expired on that flow, or if the aggregated packet exceeds RAL. The timeout may be implemented by setting a timeout to a value, timeout aggregation value, when the first packet on a flow is placed without delivery. The following packets that are aggregated may not change the timeout. When the packets are delivered due to timeout expiration the timeout may be canceled and may be set again in the next first packet that is not delivered. Notwithstanding, other embodiments of the invention may provide timeout implementation by periodically scanning all the flows.
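The per-flow timeout behavior described above (armed by the first undelivered packet, left unchanged by later packets, cancelled on delivery) might look like the following Python sketch. The AggregationTimer class and its method names are hypothetical, and the current time is passed in explicitly so the logic stays deterministic.

```python
from typing import Optional


class AggregationTimer:
    """Illustrative per-flow aggregation timeout."""

    def __init__(self, timeout_aggregation_value: float) -> None:
        self.timeout = timeout_aggregation_value
        self.deadline: Optional[float] = None

    def on_packet_placed(self, now: float) -> None:
        # Only the first packet placed without delivery arms the timer;
        # subsequent aggregated packets do not change it.
        if self.deadline is None:
            self.deadline = now + self.timeout

    def expired(self, now: float) -> bool:
        return self.deadline is not None and now >= self.deadline

    def on_delivered(self) -> None:
        # Delivery cancels the timer; the next undelivered packet re-arms it.
        self.deadline = None
```

A periodic scan over all flows, which the text mentions as an alternative, could call expired() on each flow's timer instead of relying on one hardware timer per flow.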
- In an exemplary embodiment of the invention, the received ACK SN may be relevant to determine the rules to aggregate pure ACKs and to determine the rules to stop aggregation of packets with data due to the received ACK SN. The duplicated pure ACKs may never be aggregated. When duplicated pure ACKs are received, they may cause prior aggregated packets to be delivered and the pure ACK may be delivered immediately separately. The received ACK SN may also be utilized to stop the aggregation and deliver the pending aggregated packet to the host TCP/IP stack.
- In an exemplary embodiment of the invention, a plurality of rules may be provided for stopping the aggregation according to the ACK SN. For example, the aggregation may be stopped if (1) the number of acknowledged (ACKed) bytes that are not yet delivered, taking into account the received segments and the prior segments that were not delivered, exceeds a threshold in bytes, for example, ReceiveAckedBytesAggretation; or (2) the time from the arrival of the first packet that advanced the received ACK SN exceeds a threshold, for example, TimeoutAckAggregation. For this purpose, a second timer per flow may be required, or other mechanisms, such as periodically scanning the flows, may be implemented.
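The two ACK-based stopping rules can be sketched as one predicate. In this Python illustration the parameter names are assumptions that loosely mirror the thresholds named in the text, and the timestamps are plain numbers supplied by the caller.

```python
from typing import Optional


def should_stop_for_ack(
    undelivered_acked_bytes: int,
    first_ack_advance_time: Optional[float],
    now: float,
    acked_bytes_threshold: int,
    timeout_ack_aggregation: float,
) -> bool:
    # Rule (1): too many ACKed bytes are pending delivery to the host stack.
    if undelivered_acked_bytes > acked_bytes_threshold:
        return True
    # Rule (2): the first packet that advanced the ACK SN has waited too long.
    if first_ack_advance_time is not None:
        if now - first_ack_advance_time > timeout_ack_aggregation:
            return True
    return False
```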
- In another exemplary embodiment of the invention, the flows may be removed from the host memory if one of the following occurs: (1) a reset (RST) flag was detected in the receive side; (2) a finish (FIN) flag was detected in the receive side; (3) there was no receive activity on the flow for a predefined time TerminateNoActivityTime, for example; (4) a KeepAlive packet in the receive direction was not acknowledged. A least recently used (LRU) cache may be used instead of a timeout rule to remove the flows from the host memory.
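The LRU alternative mentioned above can be sketched with Python's collections.OrderedDict. The FlowCache class, its capacity parameter, and its method names are illustrative assumptions; the source only states that an LRU cache may replace the inactivity timeout.

```python
from collections import OrderedDict
from typing import Any, Optional


class FlowCache:
    """Illustrative LRU cache of flow contexts."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._flows: "OrderedDict[Any, Any]" = OrderedDict()

    def touch(self, flow_id: Any, context: Any = None) -> Optional[Any]:
        """Record receive activity; return the evicted flow ID, if any."""
        if flow_id in self._flows:
            self._flows.move_to_end(flow_id)   # mark as most recently used
            return None
        self._flows[flow_id] = context
        if len(self._flows) > self.capacity:
            evicted, _ = self._flows.popitem(last=False)  # least recently used
            return evicted
        return None

    def remove(self, flow_id: Any) -> None:
        # Explicit removal, for example when a RST or FIN flag is detected.
        self._flows.pop(flow_id, None)

    def __contains__(self, flow_id: Any) -> bool:
        return flow_id in self._flows
```

With an LRU policy the least recently active flow is dropped only when space is needed, so no per-flow inactivity timer has to be maintained.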
- In another exemplary embodiment of the invention, the flows may be removed from the host memory if the flow was closed due to a retransmission timeout that requires information from the transmitter. In one exemplary embodiment of the invention, detecting a retransmission timeout may comprise periodically scanning all the flows to determine if any flow is closed. The period for scanning may be low, for example, 5 seconds. In each scan, if there is unacknowledged data that was transmitted by the NIC 128, the maximum transmitted sequence number (SN) may be recorded. Additionally, if there is unacknowledged data that was transmitted by the peer side, the maximum received SN may be recorded. If in two consecutive scans there is pending data on the same flow of the same type with the recorded number unchanged, this may indicate pending data that was not acknowledged for the entire scan period. In this case the flow may be removed.
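The two-scan comparison described above can be sketched as follows. This Python illustration assumes a hypothetical ScanRecord that holds, per flow and per scan, the maximum transmitted and received SN, with None meaning that no unacknowledged data of that type was pending.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanRecord:
    max_tx_sn: Optional[int] = None   # set only if transmitted data is unacknowledged
    max_rx_sn: Optional[int] = None   # set only if received data is unacknowledged


def should_remove_flow(previous: ScanRecord, current: ScanRecord) -> bool:
    # Pending data of the same type with an unchanged recorded SN in two
    # consecutive scans means nothing was acknowledged for a whole scan period.
    tx_stalled = current.max_tx_sn is not None and current.max_tx_sn == previous.max_tx_sn
    rx_stalled = current.max_rx_sn is not None and current.max_rx_sn == previous.max_rx_sn
    return tx_stalled or rx_stalled
```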
FIG. 2A is a diagram illustrating exemplary steps that may be utilized for handling out-of-order data when a packet P3 and a packet P4 arrive out-of-order with respect to the order of transmission, in accordance with an embodiment of the invention. Referring to FIG. 2A, the packets P3 and P4 may arrive in-order with respect to each other at the NIC 128 but before the arrival of a packet P2, as shown in the actual receive RX traffic pattern 200. The packets P3 and P4 may correspond to a fourth packet and a fifth packet within an isle 211, respectively, in a TCP transmission sequence. In this case, there is a gap or time interval between the end of the packet P1 and the beginning of the packet P3 in the actual receive RX traffic pattern 200. A first disjoint portion in the TCP transmission sequence may result from the arrival of the packets P3 and P4, as shown in the TCP receive sequence space 202, after the isle 213 comprising packets P0 and P1. The rightmost portion of isle 211, rcv_nxt_R, may be represented as (rcv_nxt_L + (length of isle)), where rcv_nxt_L is the leftmost portion of isle 211 and the length of isle is the sum of the lengths of packets P3 and P4.
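As a small worked example of the isle boundary relation above, the following Python sketch computes rcv_nxt_R from rcv_nxt_L and the packet lengths. The function name and the sample numbers are illustrative; the modulo reflects the fact that TCP sequence numbers are 32-bit values, which the source does not discuss explicitly.

```python
from typing import Iterable

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def isle_right_edge(rcv_nxt_l: int, packet_lengths: Iterable[int]) -> int:
    # rcv_nxt_R = rcv_nxt_L + (length of isle), where the isle length is the
    # sum of the lengths of the packets it contains.
    return (rcv_nxt_l + sum(packet_lengths)) % SEQ_SPACE
```

For instance, if P3 starts at sequence number 3000 and P3 and P4 each carry 1000 bytes, the isle spans 3000 up to 5000.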
FIG. 2B is a state diagram illustrating exemplary transparent TCP offload with transmit-receive coupling, in accordance with an embodiment of the invention. Referring to FIG. 2B, there is shown a plurality of exemplary flow states, namely, an in-order state 226, an out-of-order (OOO) state 224, and an unknown state 222. In transition state 228, the unknown state 222 may be detected for a flow for which a 3-way TCP handshake has not been detected or at some point in the life of the flow other than the initialization phase. The offload engine may also track outgoing and incoming TCP segments with the synchronize (SYN) flag set to detect a new flow. The exemplary transition states may be implemented as a state machine.
- The TCP 3-way handshake begins with a synchronize (SYN) segment containing an initial send sequence number (ISN) chosen by, and sent from, a first host. This sequence number may be the starting sequence number of the data in that packet and may increment for every byte of data sent within the segment. When the second host receives the SYN with the sequence number, it may transmit a SYN segment with its own totally independent ISN in the sequence number field along with an acknowledgment field. This acknowledgment (ACK) field may inform the recipient that its data was received at the other end and that the next segment of data bytes is expected; this segment may be referred to as the SYN-ACK. When the first host receives this SYN-ACK segment, it may send an ACK segment containing the next sequence number, called a forward acknowledgement, which is received by the second host. The ACK segment may be identified by the ACK field being set. Segments that are not acknowledged within a certain period of time may be retransmitted.
- When a flow is transparent TCP offloaded, the flow may not move from the in-order state 226 or the OOO state 224 to the unknown state 222 unless it gets removed and detected again. In transition state 230, the state diagram may track the out-of-order isle sequence number boundaries using, for example, the parameters rcv_nxt_R and rcv_nxt_L as illustrated in FIG. 2C. The first ingress segment may be referred to as an isle, for example, isle 213 (FIG. 2C), and the ordering state may be set to OOO state 224. The rightmost portion of the isle, rcv_nxt_R, may be represented as (rcv_nxt_L + (length of isle)), where rcv_nxt_L is the leftmost portion of the isle and the length of isle is the sum of the lengths of packets in the isle.
- In transition state 232, the NIC 128 may access the local stack acknowledgment information as the transmitter and receiver are coupled. The ordering state may be modified from OOO state 224 to in-order state 226 whenever an egress ACK sequence number is greater than an isle length of at least one TCP segment.
- In transition state 234, the initial ordering state may be set to the in-order state 226, if the new flow is detected with the TCP 3-way handshake. In transition state 236, the rcv_nxt_R may be utilized to check the condition of ingress packets according to the following algorithm.
    if (in_packet_sn == rcv_nxt_R)
        // when the isle is increased, update rcv_nxt_R
        rcv_nxt_R = in_packet_sn + in_packet_len
- In transition state 238, the ordering state may be modified from in-order state 226 to OOO state 224 if the isle length is not equal to rcv_nxt_R. The value of rcv_nxt_R may be used to check the condition of ingress packets according to the following algorithm.
    if (in_packet_sn != rcv_nxt_R)
        rcv_nxt_L = in_packet_sn
        rcv_nxt_R = in_packet_sn + in_packet_len
        change state to OOO 224
- In transition state 240, during OOO state 224, the boundaries of the highest OOO isle may be tracked for every ingress packet using the following exemplary algorithm.
    if (in_packet_sn == rcv_nxt_R)
        // when the isle is increased, update rcv_nxt_R
        rcv_nxt_R = in_packet_sn + in_packet_len
    else if (in_packet_sn > rcv_nxt_R)
        // when a new higher isle is generated
        rcv_nxt_R = in_packet_sn + in_packet_len
        rcv_nxt_L = in_packet_sn
- In another embodiment of the invention, the number of ACKed bytes that have not yet been delivered may exceed a fraction of the pending transmitted bytes that were not ACKed. The pending transmitted bytes count may be calculated as the difference between sndMax, the most advanced sequence number (SN) that was transmitted, and the last received ACK SN that was delivered.
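The pending-bytes comparison in the preceding paragraph can be written out as a short Python sketch. The function name and the fraction parameter are assumptions; the source does not state which fraction is used, and sequence number wraparound is ignored here.

```python
def acked_bytes_exceed_fraction(
    undelivered_acked_bytes: int,
    snd_max: int,
    last_delivered_ack_sn: int,
    fraction: float,
) -> bool:
    # Pending transmitted bytes: sndMax minus the last ACK SN delivered to the host.
    pending_transmitted = snd_max - last_delivered_ack_sn
    return undelivered_acked_bytes > fraction * pending_transmitted
```

With sndMax at 20000, a last delivered ACK SN of 10000, and a fraction of 0.25, the check triggers once more than 2500 ACKed bytes are waiting to be delivered.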
- In another embodiment of the invention, the number of ACKed bytes may exceed a dynamic threshold. This threshold may depend on the size of the packets that were transmitted to the peer. The sizes of the transmitted packets or the size of the transmission units that were sent to the chip to be transmitted and were not yet ACKed may be recorded. In case of LSO, the aggregation may be set to ACK blocks of data similar to the transmitted data units.
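Returning to the ordering states of FIG. 2B, the transitions 228 through 240 described earlier can be combined into one small state machine. The following Python sketch is an interpretation under stated assumptions: the class and method names are hypothetical, sequence number wraparound is ignored, the handshake case is seeded with an assumed next expected SN, and the condition for transition 232 is taken to be an egress ACK reaching the right edge of the isle, since the source wording for that condition is loose.

```python
from enum import Enum
from typing import Optional


class Order(Enum):
    UNKNOWN = 222
    OOO = 224
    IN_ORDER = 226


class FlowOrdering:
    def __init__(self, handshake_rcv_nxt: Optional[int] = None) -> None:
        if handshake_rcv_nxt is None:
            # Transition 228: flow detected without a 3-way handshake.
            self.state = Order.UNKNOWN
            self.rcv_nxt_l: Optional[int] = None
            self.rcv_nxt_r: Optional[int] = None
        else:
            # Transition 234: flow detected with the 3-way handshake.
            self.state = Order.IN_ORDER
            self.rcv_nxt_l = handshake_rcv_nxt
            self.rcv_nxt_r = handshake_rcv_nxt

    def on_ingress(self, sn: int, length: int) -> None:
        if self.state is Order.UNKNOWN:
            # Transition 230: the first ingress segment forms an isle.
            self.rcv_nxt_l, self.rcv_nxt_r = sn, sn + length
            self.state = Order.OOO
        elif sn == self.rcv_nxt_r:
            # Transitions 236 and 240: the isle grows to the right.
            self.rcv_nxt_r = sn + length
        elif self.state is Order.IN_ORDER:
            # Transition 238: in_packet_sn != rcv_nxt_R while in-order.
            self.rcv_nxt_l, self.rcv_nxt_r = sn, sn + length
            self.state = Order.OOO
        elif sn > self.rcv_nxt_r:
            # Transition 240: a new, higher isle is generated.
            self.rcv_nxt_l, self.rcv_nxt_r = sn, sn + length
        # A lower SN while OOO is not covered by the source pseudocode
        # and is left unchanged in this sketch.

    def on_egress_ack(self, ack_sn: int) -> None:
        # Transition 232 (assumed condition): the local stack has ACKed
        # everything up to the right edge of the tracked isle.
        if self.state is Order.OOO and self.rcv_nxt_r is not None:
            if ack_sn >= self.rcv_nxt_r:
                self.state = Order.IN_ORDER
```

Note that there is no path back to UNKNOWN, which matches the statement that an offloaded flow leaves the in-order and OOO states only by being removed and detected again.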
-
FIG. 3 is a flow chart illustrating exemplary steps for transparent TCP offload with per flow estimation of far end transmit window, in accordance with an embodiment of the invention. Referring to FIG. 3, in step 302, each of the received TCP segments and the transmitted TCP segments may be monitored to determine which network flow they belong to based on their respective header information. In step 303, it may be determined whether the received TCP segments are for a particular network flow. If the received TCP segments are not for a particular network flow, control passes to step 302. If the received TCP segments are for a particular network flow, control passes to step 304. In step 304, a network interface card (NIC) processor 130 may enable collection of at least one received TCP segment for a determined network flow. In step 306, the NIC 128 enables storage of state information for the received TCP segment and state information for transmitted TCP segments for the determined network flow without transferring state information for the received TCP segment and the state information for the transmitted TCP segments to a host system 124 communicatively coupled to the NIC 128. In step 308, the NIC 128 may determine the time period over which the received TCP segments are aggregated before transmitting them to the host system 124. The time period for aggregation may be a minimum of a time period for a termination event to occur and a time period for the far-end effective transmit window. In step 310, the far-end effective transmit window may be calculated as a maximum value of (rcv_nxt_R - ack_sn) observed since the flow started in an in-order state, where rcv_nxt_R may represent the sequence number of the next expected TCP segment and ack_sn may represent a sequence number of the next received acknowledgement packet from the host system.
- In step 312, a new TCP segment may be generated by aggregating at least a portion of a plurality of the collected TCP segments for the determined network flow over the determined period of time. In step 314, the NIC 128 enables communication of the generated new TCP segment, new state information for the new TCP segment, and the state information for the transmitted TCP segments to the host system 124 for TCP offload.
- Another embodiment of the invention may provide a machine-readable storage, having stored thereon, a computer program having at least one code section executable by a machine, thereby causing the machine to perform the steps as described above for transparent TCP offload with per flow estimation of a far end transmit window.
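The far-end effective transmit window estimate of step 310 can be sketched as a running maximum. In this Python illustration the FarEndWindowEstimator class is hypothetical and sequence number wraparound is ignored. The window_likely_exhausted helper is an added assumption about how the estimate might bound aggregation in step 308: once the bytes received but not yet ACKed by the host reach the estimate, the far end probably cannot send more until it sees an ACK.

```python
class FarEndWindowEstimator:
    """Illustrative per-flow estimate of the far-end effective transmit window."""

    def __init__(self) -> None:
        self.estimate = 0

    def observe(self, rcv_nxt_r: int, ack_sn: int) -> int:
        # Step 310: maximum of (rcv_nxt_R - ack_sn) seen while the flow is in-order.
        self.estimate = max(self.estimate, rcv_nxt_r - ack_sn)
        return self.estimate

    def window_likely_exhausted(self, unacked_received_bytes: int) -> bool:
        # Assumed use of the estimate: stop aggregating when the peer has
        # probably filled its window and is waiting for an ACK.
        return self.estimate > 0 and unacked_received_bytes >= self.estimate
```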
- Accordingly, the present invention may be realized in hardware, software, or a combination of hardware and software. The present invention may be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
- The present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
- While the present invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present invention without departing from its scope. Therefore, it is intended that the present invention not be limited to the particular embodiment disclosed, but that the present invention will include all embodiments falling within the scope of the appended claims.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/489,393 US20070014246A1 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with per flow estimation of a far end transmit window |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US70054405P | 2005-07-18 | 2005-07-18 | |
US11/489,393 US20070014246A1 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with per flow estimation of a far end transmit window |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070014246A1 (en) | 2007-01-18 |
Family ID=38163302
Family Applications (7)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/489,393 Abandoned US20070014246A1 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with per flow estimation of a far end transmit window |
US11/489,388 Active 2028-07-04 US7684344B2 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload |
US11/489,078 Expired - Fee Related US8064459B2 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with transmit and receive coupling |
US11/489,389 Expired - Fee Related US7693138B2 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with best effort direct placement of incoming traffic |
US11/489,390 Abandoned US20070033301A1 (en) | 2005-07-18 | 2006-07-18 | Method and system for transparent TCP offload with dynamic zero copy sending |
US12/728,983 Expired - Fee Related US8274976B2 (en) | 2005-07-18 | 2010-03-22 | Method and system for transparent TCP offload |
US12/754,016 Expired - Fee Related US8416768B2 (en) | 2005-07-18 | 2010-04-05 | Method and system for transparent TCP offload with best effort direct placement of incoming traffic |
Country Status (5)
Country | Link |
---|---|
US (7) | US20070014246A1 (en) |
EP (1) | EP1917782A2 (en) |
KR (1) | KR100973201B1 (en) |
CN (1) | CN101253745B (en) |
WO (1) | WO2007069095A2 (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080109562A1 (en) * | 2006-11-08 | 2008-05-08 | Hariramanathan Ramakrishnan | Network Traffic Controller (NTC) |
US20140089467A1 (en) * | 2012-09-27 | 2014-03-27 | Andre Beck | Content stream delivery using pre-loaded segments |
US20140233588A1 (en) * | 2013-02-21 | 2014-08-21 | Applied Micro Circuits Corporation | Large receive offload functionality for a system on chip |
US8935406B1 (en) * | 2007-04-16 | 2015-01-13 | Chelsio Communications, Inc. | Network adaptor configured for connection establishment offload |
WO2015048326A1 (en) | 2013-09-26 | 2015-04-02 | Acelio, Inc. | System and method for improving tcp performance in virtualized environments |
US20150263968A1 (en) * | 2014-03-11 | 2015-09-17 | Vmware, Inc. | Snooping forwarded packets by a virtual machine |
US9384033B2 (en) | 2014-03-11 | 2016-07-05 | Vmware, Inc. | Large receive offload for virtual machines |
US9742682B2 (en) | 2014-03-11 | 2017-08-22 | Vmware, Inc. | Large receive offload for virtual machines |
US9906454B2 (en) | 2014-09-17 | 2018-02-27 | AppFormix, Inc. | System and method for providing quality of service to data center applications by controlling the rate at which data packets are transmitted |
US10291472B2 (en) | 2015-07-29 | 2019-05-14 | AppFormix, Inc. | Assessment of operational states of a computing environment |
US10313926B2 (en) | 2017-05-31 | 2019-06-04 | Nicira, Inc. | Large receive offload (LRO) processing in virtualized computing environments |
US10355997B2 (en) | 2013-09-26 | 2019-07-16 | Appformix Inc. | System and method for improving TCP performance in virtualized environments |
WO2019219184A1 (en) * | 2018-05-16 | 2019-11-21 | Huawei Technologies Co., Ltd. | Receiving device and transmitting device for tcp communication |
US10581687B2 (en) | 2013-09-26 | 2020-03-03 | Appformix Inc. | Real-time cloud-infrastructure policy implementation and management |
US20200120190A1 (en) * | 2019-12-12 | 2020-04-16 | Linden Cornett | Semi-flexible packet coalescing control path |
WO2020135567A1 (en) * | 2018-12-28 | 2020-07-02 | Alibaba Group Holding Limited | Offload controller control of programmable switch |
US10862805B1 (en) | 2018-07-31 | 2020-12-08 | Juniper Networks, Inc. | Intelligent offloading of services for a network device |
US10868742B2 (en) | 2017-03-29 | 2020-12-15 | Juniper Networks, Inc. | Multi-cluster dashboard for distributed virtualization infrastructure element monitoring and policy control |
US11068314B2 (en) | 2017-03-29 | 2021-07-20 | Juniper Networks, Inc. | Micro-level monitoring, visibility and control of shared resources internal to a processor of a host machine for a virtual environment |
CN113498586A (en) * | 2019-02-15 | 2021-10-12 | 高通股份有限公司 | Method and apparatus for transport protocol ACK aggregation |
US11323327B1 (en) | 2017-04-19 | 2022-05-03 | Juniper Networks, Inc. | Virtualization infrastructure element monitoring and policy control in a cloud environment using profiles |
US20230029796A1 (en) * | 2020-04-17 | 2023-02-02 | Huawei Technologies Co., Ltd. | Stateful service processing method and apparatus |
US20230103738A1 (en) * | 2021-10-04 | 2023-04-06 | Nxp B.V. | Coalescing interrupts based on fragment information in packets and a network controller for coalescing |
Families Citing this family (118)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8724485B2 (en) | 2000-08-30 | 2014-05-13 | Broadcom Corporation | Home network system and method |
ATE485650T1 (en) | 2000-08-30 | 2010-11-15 | Tmt Coaxial Networks Inc | METHOD AND SYSTEM FOR A HOME NETWORK |
US9094226B2 (en) * | 2000-08-30 | 2015-07-28 | Broadcom Corporation | Home network system and method |
US7978598B1 (en) * | 2002-03-01 | 2011-07-12 | Cisco Technology, Inc. | Connection replication |
US7007103B2 (en) * | 2002-04-30 | 2006-02-28 | Microsoft Corporation | Method to offload a network stack |
US7616664B2 (en) * | 2005-02-18 | 2009-11-10 | Hewlett-Packard Development Company, L.P. | System and method of sending video and audio data over a network |
US20070014246A1 (en) * | 2005-07-18 | 2007-01-18 | Eliezer Aloni | Method and system for transparent TCP offload with per flow estimation of a far end transmit window |
US7660306B1 (en) | 2006-01-12 | 2010-02-09 | Chelsio Communications, Inc. | Virtualizing the operation of intelligent network interface circuitry |
US7616563B1 (en) | 2005-08-31 | 2009-11-10 | Chelsio Communications, Inc. | Method to implement an L4-L7 switch using split connections and an offloading NIC |
US7660264B1 (en) | 2005-12-19 | 2010-02-09 | Chelsio Communications, Inc. | Method for traffic schedulign in intelligent network interface circuitry |
US7724658B1 (en) | 2005-08-31 | 2010-05-25 | Chelsio Communications, Inc. | Protocol offload transmit traffic management |
US7506080B2 (en) * | 2005-09-16 | 2009-03-17 | Inter Corporation | Parallel processing of frame based data transfers |
US7735099B1 (en) * | 2005-12-23 | 2010-06-08 | Qlogic, Corporation | Method and system for processing network data |
EP1835692B1 (en) * | 2006-03-13 | 2018-08-08 | Telefonaktiebolaget LM Ericsson (publ) | Method and system for distributing service messages from clients to service applications |
EP2583744A1 (en) | 2006-03-31 | 2013-04-24 | Genencor International, Inc. | Permeate product of tangential flow filtration process |
US20080120426A1 (en) * | 2006-11-17 | 2008-05-22 | International Business Machines Corporation | Selective acceleration of transport control protocol (tcp) connections |
US7782850B2 (en) * | 2006-11-20 | 2010-08-24 | Broadcom Corporation | MAC to PHY interface apparatus and methods for transmission of packets through a communications network |
US7742495B2 (en) * | 2006-11-20 | 2010-06-22 | Broadcom Corporation | System and method for retransmitting packets over a network of communication channels |
US8090043B2 (en) | 2006-11-20 | 2012-01-03 | Broadcom Corporation | Apparatus and methods for compensating for signal imbalance in a receiver |
US20080133654A1 (en) * | 2006-12-01 | 2008-06-05 | Chei-Yol Kim | Network block device using network asynchronous i/o |
US7849214B2 (en) * | 2006-12-04 | 2010-12-07 | Electronics And Telecommunications Research Institute | Packet receiving hardware apparatus for TCP offload engine and receiving system and method using the same |
US8161532B2 (en) * | 2007-04-04 | 2012-04-17 | Microsoft Corporation | Operating system independent architecture for subscription computing |
US8589587B1 (en) * | 2007-05-11 | 2013-11-19 | Chelsio Communications, Inc. | Protocol offload in intelligent network adaptor, including application level signalling |
US8060644B1 (en) * | 2007-05-11 | 2011-11-15 | Chelsio Communications, Inc. | Intelligent network adaptor with end-to-end flow control |
US8345553B2 (en) | 2007-05-31 | 2013-01-01 | Broadcom Corporation | Apparatus and methods for reduction of transmission delay in a communication network |
US7715362B1 (en) * | 2007-11-23 | 2010-05-11 | Juniper Networks, Inc. | Identification fragment handling |
KR100936918B1 (en) * | 2007-12-17 | 2010-01-18 | 한국전자통신연구원 | Static file transfer system call processing TOE device and method |
US20100017513A1 (en) * | 2008-07-16 | 2010-01-21 | Cray Inc. | Multiple overlapping block transfers |
US8341286B1 (en) * | 2008-07-31 | 2012-12-25 | Alacritech, Inc. | TCP offload send optimization |
US9112717B2 (en) | 2008-07-31 | 2015-08-18 | Broadcom Corporation | Systems and methods for providing a MoCA power management strategy |
US8254413B2 (en) | 2008-12-22 | 2012-08-28 | Broadcom Corporation | Systems and methods for physical layer (“PHY”) concatenation in a multimedia over coax alliance network |
US8238227B2 (en) * | 2008-12-22 | 2012-08-07 | Broadcom Corporation | Systems and methods for providing a MoCA improved performance for short burst packets |
US8213309B2 (en) | 2008-12-22 | 2012-07-03 | Broadcom Corporation | Systems and methods for reducing latency and reservation request overhead in a communications network |
US20100165838A1 (en) * | 2008-12-30 | 2010-07-01 | Yury Bakshi | Method and apparatus for improving data throughput in a network |
US20100238932A1 (en) * | 2009-03-19 | 2010-09-23 | Broadcom Corporation | Method and apparatus for enhanced packet aggregation |
US8553547B2 (en) | 2009-03-30 | 2013-10-08 | Broadcom Corporation | Systems and methods for retransmitting packets over a network of communication channels |
US20100254278A1 (en) | 2009-04-07 | 2010-10-07 | Broadcom Corporation | Assessment in an information network |
US8730798B2 (en) | 2009-05-05 | 2014-05-20 | Broadcom Corporation | Transmitter channel throughput in an information network |
US8867355B2 (en) | 2009-07-14 | 2014-10-21 | Broadcom Corporation | MoCA multicast handling |
US8942250B2 (en) | 2009-10-07 | 2015-01-27 | Broadcom Corporation | Systems and methods for providing service (“SRV”) node selection |
US9015318B1 (en) | 2009-11-18 | 2015-04-21 | Cisco Technology, Inc. | System and method for inspecting domain name system flows in a network environment |
US9009293B2 (en) * | 2009-11-18 | 2015-04-14 | Cisco Technology, Inc. | System and method for reporting packet characteristics in a network environment |
US9148380B2 (en) * | 2009-11-23 | 2015-09-29 | Cisco Technology, Inc. | System and method for providing a sequence numbering mechanism in a network environment |
US9535732B2 (en) * | 2009-11-24 | 2017-01-03 | Red Hat Israel, Ltd. | Zero copy transmission in virtualization environment |
US8737262B2 (en) * | 2009-11-24 | 2014-05-27 | Red Hat Israel, Ltd. | Zero copy transmission with raw packets |
EP2509000A4 (en) * | 2009-12-04 | 2017-09-20 | Nec Corporation | Server and flow control program |
US8792495B1 (en) | 2009-12-19 | 2014-07-29 | Cisco Technology, Inc. | System and method for managing out of order packets in a network environment |
US8611327B2 (en) | 2010-02-22 | 2013-12-17 | Broadcom Corporation | Method and apparatus for policing a QoS flow in a MoCA 2.0 network |
US8514860B2 (en) | 2010-02-23 | 2013-08-20 | Broadcom Corporation | Systems and methods for implementing a high throughput mode for a MoCA device |
US8428087B1 (en) * | 2010-09-17 | 2013-04-23 | Amazon Technologies, Inc. | Framework for stateless packet tunneling |
US9235474B1 (en) * | 2011-02-17 | 2016-01-12 | Axcient, Inc. | Systems and methods for maintaining a virtual failover volume of a target computing system |
US8924360B1 (en) | 2010-09-30 | 2014-12-30 | Axcient, Inc. | Systems and methods for restoring a file |
US8954544B2 (en) | 2010-09-30 | 2015-02-10 | Axcient, Inc. | Cloud-based virtual machines and offices |
US9705730B1 (en) | 2013-05-07 | 2017-07-11 | Axcient, Inc. | Cloud storage using Merkle trees |
US8589350B1 (en) | 2012-04-02 | 2013-11-19 | Axcient, Inc. | Systems, methods, and media for synthesizing views of file system backups |
US10284437B2 (en) | 2010-09-30 | 2019-05-07 | Efolder, Inc. | Cloud-based virtual machines and offices |
US8787303B2 (en) | 2010-10-05 | 2014-07-22 | Cisco Technology, Inc. | Methods and apparatus for data traffic offloading at a router |
JP5204195B2 (en) * | 2010-10-29 | 2013-06-05 | 株式会社東芝 | Data transmission system and data transmission program |
US8495262B2 (en) * | 2010-11-23 | 2013-07-23 | International Business Machines Corporation | Using a table to determine if user buffer is marked copy-on-write |
CN102111403B (en) * | 2010-12-17 | 2014-05-21 | 曙光信息产业(北京)有限公司 | Method and device for acquiring transmission control protocol (TCP) connection data at high speed |
CN102075416B (en) * | 2010-12-17 | 2014-07-30 | 曙光信息产业(北京)有限公司 | Method for realizing TCP (transmission control protocol) connection data buffer by combining software and hardware |
CN102082688B (en) * | 2010-12-17 | 2014-08-13 | 曙光信息产业(北京)有限公司 | Method for realizing management of TCP (transmission control protocol) out-of-order buffer by means of combination of software and hardware |
US9003057B2 (en) | 2011-01-04 | 2015-04-07 | Cisco Technology, Inc. | System and method for exchanging information in a mobile wireless network environment |
US8891532B1 (en) * | 2011-05-17 | 2014-11-18 | Hitachi Data Systems Engineering UK Limited | System and method for conveying the reason for TCP reset in machine-readable form |
US8589610B2 (en) | 2011-05-31 | 2013-11-19 | Oracle International Corporation | Method and system for receiving commands using a scoreboard on an infiniband host channel adaptor |
US8490207B2 (en) * | 2011-05-31 | 2013-07-16 | Red Hat, Inc. | Performing zero-copy sends in a networked file system with cryptographic signing |
US8484392B2 (en) | 2011-05-31 | 2013-07-09 | Oracle International Corporation | Method and system for infiniband host channel adaptor quality of service |
US8804752B2 (en) | 2011-05-31 | 2014-08-12 | Oracle International Corporation | Method and system for temporary data unit storage on infiniband host channel adaptor |
US8948013B1 (en) | 2011-06-14 | 2015-02-03 | Cisco Technology, Inc. | Selective packet sequence acceleration in a network environment |
US8743690B1 (en) | 2011-06-14 | 2014-06-03 | Cisco Technology, Inc. | Selective packet sequence acceleration in a network environment |
US8737221B1 (en) | 2011-06-14 | 2014-05-27 | Cisco Technology, Inc. | Accelerated processing of aggregate data flows in a network environment |
US8792353B1 (en) | 2011-06-14 | 2014-07-29 | Cisco Technology, Inc. | Preserving sequencing during selective packet acceleration in a network environment |
US8688799B2 (en) | 2011-06-30 | 2014-04-01 | Nokia Corporation | Methods, apparatuses and computer program products for reducing memory copy overhead by indicating a location of requested data for direct access |
US9021123B2 (en) | 2011-08-23 | 2015-04-28 | Oracle International Corporation | Method and system for responder side cut through of received data |
US8879579B2 (en) | 2011-08-23 | 2014-11-04 | Oracle International Corporation | Method and system for requester virtual cut through |
US8832216B2 (en) | 2011-08-31 | 2014-09-09 | Oracle International Corporation | Method and system for conditional remote direct memory access write |
EP2595351A1 (en) * | 2011-11-15 | 2013-05-22 | Eaton Industries GmbH | Device for a digital transfer system, digital transfer system and data transfer method |
US8996718B2 (en) * | 2012-02-02 | 2015-03-31 | Apple Inc. | TCP-aware receive side coalescing |
US9155046B2 (en) * | 2012-09-12 | 2015-10-06 | Intel Corporation | Optimizing semi-active workloads |
US20140092754A1 (en) * | 2012-09-28 | 2014-04-03 | Fluke Corporation | Packet tagging mechanism |
US9785647B1 (en) | 2012-10-02 | 2017-10-10 | Axcient, Inc. | File system virtualization |
US9852140B1 (en) | 2012-11-07 | 2017-12-26 | Axcient, Inc. | Efficient file replication |
US9148352B2 (en) | 2012-12-20 | 2015-09-29 | Oracle International Corporation | Method and system for dynamic repurposing of payload storage as a trace buffer |
US9069633B2 (en) | 2012-12-20 | 2015-06-30 | Oracle America, Inc. | Proxy queue pair for offloading |
US9384072B2 (en) | 2012-12-20 | 2016-07-05 | Oracle International Corporation | Distributed queue pair state on a host channel adapter |
US9191452B2 (en) | 2012-12-20 | 2015-11-17 | Oracle International Corporation | Method and system for an on-chip completion cache for optimized completion building |
US9256555B2 (en) | 2012-12-20 | 2016-02-09 | Oracle International Corporation | Method and system for queue descriptor cache management for a host channel adapter |
US9069485B2 (en) | 2012-12-20 | 2015-06-30 | Oracle International Corporation | Doorbell backpressure avoidance mechanism on a host channel adapter |
US8937949B2 (en) | 2012-12-20 | 2015-01-20 | Oracle International Corporation | Method and system for Infiniband host channel adapter multicast packet replication mechanism |
US9069705B2 (en) | 2013-02-26 | 2015-06-30 | Oracle International Corporation | CAM bit error recovery |
US8850085B2 (en) | 2013-02-26 | 2014-09-30 | Oracle International Corporation | Bandwidth aware request throttling |
US9336158B2 (en) | 2013-02-26 | 2016-05-10 | Oracle International Corporation | Method and system for simplified address translation support for static infiniband host channel adaptor structures |
US9397907B1 (en) | 2013-03-07 | 2016-07-19 | Axcient, Inc. | Protection status determinations for computing devices |
US9292153B1 (en) | 2013-03-07 | 2016-03-22 | Axcient, Inc. | Systems and methods for providing efficient and focused visualization of data |
US9338918B2 (en) | 2013-07-10 | 2016-05-10 | Samsung Electronics Co., Ltd. | Socket interposer and computer system using the socket interposer |
GB2519745B (en) * | 2013-10-22 | 2018-04-18 | Canon Kk | Method of processing disordered frame portion data units |
CN105578524B (en) * | 2014-10-07 | 2019-01-25 | 国基电子(上海)有限公司 | Terminal device and method for processing packet |
US9838498B2 (en) * | 2014-10-30 | 2017-12-05 | ScaleFlux | Remote direct non-volatile cache access |
US10298494B2 (en) | 2014-11-19 | 2019-05-21 | Strato Scale Ltd. | Reducing short-packet overhead in computer clusters |
WO2016079626A1 (en) * | 2014-11-19 | 2016-05-26 | Strato Scale Ltd. | Reducing short-packet overhead in computer clusters |
US10212259B2 (en) | 2014-12-01 | 2019-02-19 | Oracle International Corporation | Management of transmission control blocks (TCBs) supporting TCP connection requests in multiprocessing environments |
US10320918B1 (en) * | 2014-12-17 | 2019-06-11 | Xilinx, Inc. | Data-flow architecture for a TCP offload engine |
US9846657B2 (en) | 2015-02-06 | 2017-12-19 | Mediatek Inc. | Electronic device for packing multiple commands in one compound command frame and electronic device for decoding and executing multiple commands packed in one compound command frame |
US9584628B2 (en) | 2015-03-17 | 2017-02-28 | Freescale Semiconductor, Inc. | Zero-copy data transmission system |
JP2017046325A (en) * | 2015-08-28 | 2017-03-02 | 株式会社東芝 | Communication device, communication method, and program |
US9954979B2 (en) * | 2015-09-21 | 2018-04-24 | International Business Machines Corporation | Protocol selection for transmission control protocol/internet protocol (TCP/IP) |
CN105871739B (en) * | 2016-06-17 | 2018-12-07 | 华为技术有限公司 | Method and computing device for processing packets |
US10237183B2 (en) * | 2016-10-03 | 2019-03-19 | Guavus, Inc. | Detecting tethering in networks |
US11855898B1 (en) * | 2018-03-14 | 2023-12-26 | F5, Inc. | Methods for traffic dependent direct memory access optimization and devices thereof |
US10798014B1 (en) * | 2019-04-05 | 2020-10-06 | Arista Networks, Inc. | Egress maximum transmission unit (MTU) enforcement |
CN110535827B (en) * | 2019-07-17 | 2021-08-24 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Method and system for fully offloading IP core of TCP protocol for realizing multi-connection management |
US11405324B1 (en) * | 2019-12-12 | 2022-08-02 | Amazon Technologies, Inc. | Packet serial number validation |
KR20210137702A (en) * | 2020-05-11 | 2021-11-18 | 삼성전자주식회사 | Electronic device and method for processing a data packet received in the electronic device |
CN112953967A (en) * | 2021-03-30 | 2021-06-11 | 扬州万方电子技术有限责任公司 | Network protocol unloading device and data transmission system |
US12279152B2 (en) * | 2022-08-18 | 2025-04-15 | Apple Inc. | Dynamic L2 buffer scaling |
TWI820977B (en) | 2022-10-21 | 2023-11-01 | 中原大學 | Packet sorting and reassembly circuit module |
US20240348676A1 (en) * | 2023-04-11 | 2024-10-17 | Cisco Technology, Inc. | Application-centric web protocol-based data storage |
CN117354400B (en) * | 2023-12-06 | 2024-02-02 | 商飞软件有限公司 | Acquisition and analysis service system for Beidou short message |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5673031A (en) * | 1988-08-04 | 1997-09-30 | Norand Corporation | Redundant radio frequency network having a roaming terminal communication protocol |
US6434620B1 (en) * | 1998-08-27 | 2002-08-13 | Alacritech, Inc. | TCP/IP offload network interface device |
US6628617B1 (en) * | 1999-03-03 | 2003-09-30 | Lucent Technologies Inc. | Technique for internetworking traffic on connectionless and connection-oriented networks |
US20040042483A1 (en) * | 2002-08-30 | 2004-03-04 | Uri Elzur | System and method for TCP offload |
US20040054813A1 (en) * | 1997-10-14 | 2004-03-18 | Alacritech, Inc. | TCP offload network interface device |
US20040095883A1 (en) * | 2002-11-18 | 2004-05-20 | Chu Hsiao-Keng J. | Method and system for TCP large segment offload with ack-based transmit scheduling |
US6961539B2 (en) * | 2001-08-09 | 2005-11-01 | Hughes Electronics Corporation | Low latency handling of transmission control protocol messages in a broadband satellite communications system |
US20060133278A1 (en) * | 2004-12-03 | 2006-06-22 | Microsoft Corporation | Efficient transfer of messages using reliable messaging protocols for web services |
US7275093B1 (en) * | 2000-04-26 | 2007-09-25 | 3Com Corporation | Methods and device for managing message size transmitted over a network |
Family Cites Families (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5727142A (en) * | 1996-05-03 | 1998-03-10 | International Business Machines Corporation | Method for a non-disruptive host connection switch after detection of an error condition or during a host outage or failure |
US5778414A (en) * | 1996-06-13 | 1998-07-07 | Racal-Datacom, Inc. | Performance enhancing memory interleaver for data frame processing |
US5940404A (en) * | 1997-04-30 | 1999-08-17 | International Business Machines Corporation | Method and apparatus for enhanced scatter mode allowing user data to be page aligned |
US7174393B2 (en) * | 2000-12-26 | 2007-02-06 | Alacritech, Inc. | TCP/IP offload network interface device |
US6157955A (en) * | 1998-06-15 | 2000-12-05 | Intel Corporation | Packet processing system including a policy engine having a classification unit |
US6347364B1 (en) * | 1998-11-20 | 2002-02-12 | International Business Machines Corp. | Schedulable dynamic memory pinning |
US6952409B2 (en) * | 1999-05-17 | 2005-10-04 | Jolitz Lynne G | Accelerator system and method |
JP2000332817A (en) * | 1999-05-18 | 2000-11-30 | Fujitsu Ltd | Packet processing device |
CN1246012A (en) * | 1999-07-14 | 2000-03-01 | 邮电部武汉邮电科学研究院 | Adaptation method for making internet be compatible with synchronous digital system |
JP2001045061A (en) * | 1999-08-02 | 2001-02-16 | Hitachi Ltd | Communication node device |
US6804239B1 (en) | 1999-08-17 | 2004-10-12 | Mindspeed Technologies, Inc. | Integrated circuit that processes communication packets with co-processor circuitry to correlate a packet stream with context information |
US6799202B1 (en) * | 1999-12-16 | 2004-09-28 | Hachiro Kawaii | Federated operating system for a server |
US6535969B1 (en) * | 2000-06-15 | 2003-03-18 | Lsi Logic Corporation | Method and apparatus for allocating free memory |
US6958997B1 (en) * | 2000-07-05 | 2005-10-25 | Cisco Technology, Inc. | TCP fast recovery extended method and apparatus |
US20020010765A1 (en) * | 2000-07-21 | 2002-01-24 | John Border | Method and system for prioritizing traffic in a network |
US20050203927A1 (en) * | 2000-07-24 | 2005-09-15 | Vivcom, Inc. | Fast metadata generation and delivery |
US20030046330A1 (en) | 2001-09-04 | 2003-03-06 | Hayes John W. | Selective offloading of protocol processing |
US7111162B1 (en) * | 2001-09-10 | 2006-09-19 | Cisco Technology, Inc. | Load balancing approach for scaling secure sockets layer performance |
US7359326B1 (en) * | 2002-02-05 | 2008-04-15 | 3Com Corporation | Method for splitting data and acknowledgements in a TCP session |
US7237031B2 (en) * | 2002-03-07 | 2007-06-26 | Sun Microsystems, Inc. | Method and apparatus for caching protocol processing data |
US7487264B2 (en) * | 2002-06-11 | 2009-02-03 | Pandya Ashish A | High performance IP processor |
US7277963B2 (en) * | 2002-06-26 | 2007-10-02 | Sandvine Incorporated | TCP proxy providing application layer modifications |
US7142540B2 (en) * | 2002-07-18 | 2006-11-28 | Sun Microsystems, Inc. | Method and apparatus for zero-copy receive buffer management |
EP1552408A4 (en) | 2002-08-30 | 2010-10-06 | Broadcom Corp | System and method for TCP/IP offload independent of bandwidth delay product |
US7397800B2 (en) * | 2002-08-30 | 2008-07-08 | Broadcom Corporation | Method and system for data placement of out-of-order (OOO) TCP segments |
US7299266B2 (en) * | 2002-09-05 | 2007-11-20 | International Business Machines Corporation | Memory management offload for RDMA enabled network adapters |
CN1254065C (en) * | 2002-10-29 | 2006-04-26 | 华为技术有限公司 | Random storage implemented TCP connecting timer and its implementing method |
US8069225B2 (en) * | 2003-04-14 | 2011-11-29 | Riverbed Technology, Inc. | Transparent client-server transaction accelerator |
US7742473B2 (en) * | 2002-11-12 | 2010-06-22 | Mark Adams | Accelerator module |
US7324540B2 (en) * | 2002-12-31 | 2008-01-29 | Intel Corporation | Network protocol off-load engines |
US7330862B1 (en) * | 2003-04-25 | 2008-02-12 | Network Appliance, Inc. | Zero copy write datapath |
WO2004102115A1 (en) | 2003-05-16 | 2004-11-25 | Fujitsu Limited | Angle measuring system |
US20050108518A1 (en) * | 2003-06-10 | 2005-05-19 | Pandya Ashish A. | Runtime adaptable security processor |
US20050021558A1 (en) | 2003-06-11 | 2005-01-27 | Beverly Harlan T. | Network protocol off-load engine memory management |
US7251745B2 (en) * | 2003-06-11 | 2007-07-31 | Availigent, Inc. | Transparent TCP connection failover |
US7359380B1 (en) * | 2003-06-24 | 2008-04-15 | Nvidia Corporation | Network protocol processing for routing and bridging |
US7420991B2 (en) * | 2003-07-15 | 2008-09-02 | Qlogic Corporation | TCP time stamp processing in hardware based TCP offload |
US8086747B2 (en) * | 2003-09-22 | 2011-12-27 | Anilkumar Dominic | Group-to-group communication over a single connection |
US20050086349A1 (en) * | 2003-10-16 | 2005-04-21 | Nagarajan Subramaniyan | Methods and apparatus for offloading TCP/IP processing using a protocol driver interface filter driver |
US6996070B2 (en) * | 2003-12-05 | 2006-02-07 | Alacritech, Inc. | TCP/IP offload device with reduced sequential processing |
US7383483B2 (en) * | 2003-12-11 | 2008-06-03 | International Business Machines Corporation | Data transfer error checking |
US7441006B2 (en) * | 2003-12-11 | 2008-10-21 | International Business Machines Corporation | Reducing number of write operations relative to delivery of out-of-order RDMA send messages by managing reference counter |
US8195835B2 (en) * | 2004-01-28 | 2012-06-05 | Alcatel Lucent | Endpoint address change in a packet network |
US7826457B2 (en) * | 2004-05-11 | 2010-11-02 | Broadcom Corp. | Method and system for handling out-of-order segments in a wireless system via direct data placement |
US20050286526A1 (en) * | 2004-06-25 | 2005-12-29 | Sood Sanjeev H | Optimized algorithm for stream re-assembly |
US7613813B2 (en) * | 2004-09-10 | 2009-11-03 | Cavium Networks, Inc. | Method and apparatus for reducing host overhead in a socket server implementation |
US7509419B2 (en) * | 2005-01-13 | 2009-03-24 | International Business Machines Corporation | Method for providing remote access redirect capability in a channel adapter of a system area network |
US7535907B2 (en) * | 2005-04-08 | 2009-05-19 | Cavium Networks, Inc. | TCP engine |
US20070014246A1 (en) * | 2005-07-18 | 2007-01-18 | Eliezer Aloni | Method and system for transparent TCP offload with per flow estimation of a far end transmit window |
CA2514039A1 (en) * | 2005-07-28 | 2007-01-28 | Third Brigade Inc. | TCP normalization engine |
US7596628B2 (en) * | 2006-05-01 | 2009-09-29 | Broadcom Corporation | Method and system for transparent TCP offload (TTO) with a user space library |

Date | Office | Application number | Publication | Status |
---|---|---|---|---|
2006-07-18 | US | US11/489,393 | US20070014246A1 | Abandoned |
2006-07-18 | US | US11/489,388 | US7684344B2 | Active |
2006-07-18 | EP | EP06848596A | EP1917782A2 | Withdrawn |
2006-07-18 | US | US11/489,078 | US8064459B2 | Expired - Fee Related |
2006-07-18 | WO | PCT/IB2006/004098 | WO2007069095A2 | Application Filing |
2006-07-18 | KR | KR1020087002991A | KR100973201B1 | Expired - Fee Related |
2006-07-18 | US | US11/489,389 | US7693138B2 | Expired - Fee Related |
2006-07-18 | US | US11/489,390 | US20070033301A1 | Abandoned |
2006-07-18 | CN | CN2006800262474A | CN101253745B | Expired - Fee Related |
2010-03-22 | US | US12/728,983 | US8274976B2 | Expired - Fee Related |
2010-04-05 | US | US12/754,016 | US8416768B2 | Expired - Fee Related |
Cited By (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080109562A1 (en) * | 2006-11-08 | 2008-05-08 | Hariramanathan Ramakrishnan | Network Traffic Controller (NTC) |
US10749994B2 (en) | 2006-11-08 | 2020-08-18 | Standard Microsystems Corporation | Network traffic controller (NTC) |
US9794378B2 (en) | 2006-11-08 | 2017-10-17 | Standard Microsystems Corporation | Network traffic controller (NTC) |
US9537878B1 (en) | 2007-04-16 | 2017-01-03 | Chelsio Communications, Inc. | Network adaptor configured for connection establishment offload |
US8935406B1 (en) * | 2007-04-16 | 2015-01-13 | Chelsio Communications, Inc. | Network adaptor configured for connection establishment offload |
US20140089467A1 (en) * | 2012-09-27 | 2014-03-27 | Andre Beck | Content stream delivery using pre-loaded segments |
US9300578B2 (en) * | 2013-02-21 | 2016-03-29 | Applied Micro Circuits Corporation | Large receive offload functionality for a system on chip |
US10972390B2 (en) * | 2013-02-21 | 2021-04-06 | Ampere Computing Llc | TCP segmentation offload in a server on a chip |
US20140233588A1 (en) * | 2013-02-21 | 2014-08-21 | Applied Micro Circuits Corporation | Large receive offload functionality for a system on chip |
WO2015048326A1 (en) | 2013-09-26 | 2015-04-02 | Acelio, Inc. | System and method for improving TCP performance in virtualized environments |
US12021692B2 (en) | 2013-09-26 | 2024-06-25 | Juniper Networks, Inc. | Policy implementation and management |
US11140039B2 (en) | 2013-09-26 | 2021-10-05 | Appformix Inc. | Policy implementation and management |
US10355997B2 (en) | 2013-09-26 | 2019-07-16 | Appformix Inc. | System and method for improving TCP performance in virtualized environments |
EP3049930A4 (en) * | 2013-09-26 | 2017-05-31 | Appformix Inc. | System and method for improving TCP performance in virtualized environments |
US10581687B2 (en) | 2013-09-26 | 2020-03-03 | Appformix Inc. | Real-time cloud-infrastructure policy implementation and management |
US10116574B2 (en) | 2013-09-26 | 2018-10-30 | Juniper Networks, Inc. | System and method for improving TCP performance in virtualized environments |
US9742682B2 (en) | 2014-03-11 | 2017-08-22 | Vmware, Inc. | Large receive offload for virtual machines |
US20150263968A1 (en) * | 2014-03-11 | 2015-09-17 | Vmware, Inc. | Snooping forwarded packets by a virtual machine |
US9384033B2 (en) | 2014-03-11 | 2016-07-05 | Vmware, Inc. | Large receive offload for virtual machines |
US9755981B2 (en) * | 2014-03-11 | 2017-09-05 | Vmware, Inc. | Snooping forwarded packets by a virtual machine |
US9906454B2 (en) | 2014-09-17 | 2018-02-27 | AppFormix, Inc. | System and method for providing quality of service to data center applications by controlling the rate at which data packets are transmitted |
US9929962B2 (en) | 2014-09-17 | 2018-03-27 | AppFormix, Inc. | System and method to control bandwidth of classes of network traffic using bandwidth limits and reservations |
US11658874B2 (en) | 2015-07-29 | 2023-05-23 | Juniper Networks, Inc. | Assessment of operational states of a computing environment |
US10291472B2 (en) | 2015-07-29 | 2019-05-14 | AppFormix, Inc. | Assessment of operational states of a computing environment |
US10868742B2 (en) | 2017-03-29 | 2020-12-15 | Juniper Networks, Inc. | Multi-cluster dashboard for distributed virtualization infrastructure element monitoring and policy control |
US11888714B2 (en) | 2017-03-29 | 2024-01-30 | Juniper Networks, Inc. | Policy controller for distributed virtualization infrastructure element monitoring |
US11068314B2 (en) | 2017-03-29 | 2021-07-20 | Juniper Networks, Inc. | Micro-level monitoring, visibility and control of shared resources internal to a processor of a host machine for a virtual environment |
US11240128B2 (en) | 2017-03-29 | 2022-02-01 | Juniper Networks, Inc. | Policy controller for distributed virtualization infrastructure element monitoring |
US12021693B1 (en) | 2017-04-19 | 2024-06-25 | Juniper Networks, Inc. | Virtualization infrastructure element monitoring and policy control in a cloud environment using profiles |
US11323327B1 (en) | 2017-04-19 | 2022-05-03 | Juniper Networks, Inc. | Virtualization infrastructure element monitoring and policy control in a cloud environment using profiles |
US10313926B2 (en) | 2017-05-31 | 2019-06-04 | Nicira, Inc. | Large receive offload (LRO) processing in virtualized computing environments |
WO2019219184A1 (en) * | 2018-05-16 | 2019-11-21 | Huawei Technologies Co., Ltd. | Receiving device and transmitting device for TCP communication |
US10862805B1 (en) | 2018-07-31 | 2020-12-08 | Juniper Networks, Inc. | Intelligent offloading of services for a network device |
WO2020135567A1 (en) * | 2018-12-28 | 2020-07-02 | Alibaba Group Holding Limited | Offload controller control of programmable switch |
CN113498586A (en) * | 2019-02-15 | 2021-10-12 | 高通股份有限公司 | Method and apparatus for transport protocol ACK aggregation |
US11916840B2 (en) | 2019-02-15 | 2024-02-27 | Qualcomm Incorporated | Methods and apparatus for transport protocol ACK aggregation |
US11831742B2 (en) * | 2019-12-12 | 2023-11-28 | Intel Corporation | Semi-flexible packet coalescing control path |
US20200120190A1 (en) * | 2019-12-12 | 2020-04-16 | Linden Cornett | Semi-flexible packet coalescing control path |
EP4131880A4 (en) * | 2020-04-17 | 2023-03-15 | Huawei Technologies Co., Ltd. | Method and apparatus for processing a stateful service |
US20230029796A1 (en) * | 2020-04-17 | 2023-02-02 | Huawei Technologies Co., Ltd. | Stateful service processing method and apparatus |
US12199883B2 (en) * | 2020-04-17 | 2025-01-14 | Huawei Technologies Co., Ltd. | Stateful service processing method and apparatus |
US11909851B2 (en) * | 2021-10-04 | 2024-02-20 | Nxp B.V. | Coalescing interrupts based on fragment information in packets and a network controller for coalescing |
US20230103738A1 (en) * | 2021-10-04 | 2023-04-06 | Nxp B.V. | Coalescing interrupts based on fragment information in packets and a network controller for coalescing |
Also Published As
Publication number | Publication date |
---|---|
CN101253745B (en) | 2011-06-22 |
US20070033301A1 (en) | 2007-02-08 |
US7684344B2 (en) | 2010-03-23 |
US20100174824A1 (en) | 2010-07-08 |
KR20080042812A (en) | 2008-05-15 |
US20080310420A1 (en) | 2008-12-18 |
KR100973201B1 (en) | 2010-07-30 |
US20070076623A1 (en) | 2007-04-05 |
WO2007069095A3 (en) | 2007-12-06 |
CN101253745A (en) | 2008-08-27 |
US8064459B2 (en) | 2011-11-22 |
WO2007069095A2 (en) | 2007-06-21 |
US7693138B2 (en) | 2010-04-06 |
US8416768B2 (en) | 2013-04-09 |
US20100198984A1 (en) | 2010-08-05 |
US20070014245A1 (en) | 2007-01-18 |
US8274976B2 (en) | 2012-09-25 |
EP1917782A2 (en) | 2008-05-07 |
WO2007069095A8 (en) | 2007-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8064459B2 (en) | Method and system for transparent TCP offload with transmit and receive coupling | |
US7912064B2 (en) | System and method for handling out-of-order frames | |
US7596628B2 (en) | Method and system for transparent TCP offload (TTO) with a user space library | |
EP2887595B1 (en) | Method and node for retransmitting data packets in a TCP connection | |
US8174975B2 (en) | Network adapter with TCP support | |
US7346701B2 (en) | System and method for TCP offload | |
US20070022212A1 (en) | Method and system for TCP large receive offload | |
US8259728B2 (en) | Method and system for a fast drop recovery for a TCP connection | |
US7912060B1 (en) | Protocol accelerator and method of using same | |
EP1460804A2 (en) | System and method for handling out-of-order frames (fka reception of out-of-order TCP data with zero copy service) | |
HK1126049A (en) | Method and system for transparent TCP offload | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: BROADCOM ISRAEL RESEARCH LTD., ISRAEL Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ALONI, ELIEZER;SHALOM, RAFI;MIZRACHI, SHAY;AND OTHERS;REEL/FRAME:018578/0078;SIGNING DATES FROM 20060615 TO 20060626 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:037806/0001 Effective date: 20160201 |
|
AS | Assignment |
Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:041706/0001 Effective date: 20170120 |
|
AS | Assignment |
Owner name: BROADCOM CORPORATION, CALIFORNIA Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS COLLATERAL AGENT;REEL/FRAME:041712/0001 Effective date: 20170119 |