
HK1180802B - Method and system for load instruction for communicating with adapters - Google Patents


Info

Publication number: HK1180802B
Authority: HK (Hong Kong)
Prior art keywords: adapter, data, address, instruction, function
Application number: HK13108100.5A
Other languages: Chinese (zh)
Other versions: HK1180802A1 (en)
Inventors: D. Greiner, D. Craddock, T. Gregg, M. Farrell
Original Assignee: International Business Machines Corporation
Priority claimed from US 12/821,182 (US 8,566,480 B2)
Application filed by International Business Machines Corporation
Publication of HK1180802A1
Publication of HK1180802B

Description

Method and system for load instruction for communicating with adapters
Technical Field
The present invention relates generally to input/output processing of computing environments, and more particularly to facilitating communication with adapters of computing environments.
Background
A computing environment may include one or more types of input/output devices, including various types of adapters. One type of adapter is a Peripheral Component Interconnect (PCI) or Peripheral Component Interconnect Express (PCIe) adapter. An adapter includes one or more address spaces used in transferring data between the adapter and the system to which the adapter is attached. The PCI specifications are available from the PCI Special Interest Group (PCI-SIG) on the World Wide Web at www.pcisig.com/home.
U.S. Patent No. 6,704,831, entitled "Method and Apparatus for Converting Address Information Between PCI Bus Protocol and a Message-Passing, Queue-Oriented Bus Protocol," issued to Avery on March 9, 2004, describes PCI load/store operations and the implementation of DMA operations via pairs of work queues in a message-passing, queue-oriented bus architecture. The PCI address space is divided into segments, and each segment is in turn divided into regions. A separate work queue is assigned to each segment. The first portion of a PCI address matches the address range represented by a segment and is used to select the memory segment and its corresponding work queue. An entry in the work queue holds a second portion of the PCI address that specifies a region, within the selected segment, that is assigned to a particular PCI device. In one embodiment, PIO load/store operations are implemented by selecting the work queue assigned to PIO operations and creating a work queue entry using the PCI address of a register on the PCI device and a pointer to the PIO data. The work queue entry is sent to the PCI bridge, where the PCI address is extracted and the appropriate device register is programmed with the data referenced by the data pointer. DMA transfers may also be implemented by comparing a portion of the PCI address generated by the PCI device to an address range table and selecting the work queue that services that address range. The remainder of the PCI address and a pointer to the DMA data are used to create a work queue entry. DMA transfers are performed using RDMA operations. The page and region data are used in conjunction with a translation protection table in the host channel adapter to access physical memory and perform the DMA transfer.
A computer system described by Kjos et al., in a publication entitled "Partial Virtualization of an I/O Device for Use by Virtual Machines," published on 11/3/2009, includes a virtual machine monitor executable on a physical computer and adapted to create, for at least one guest operating system, a simulation of the physical computer that the guest operating system is adapted to control. The computer system further includes a host executable on the physical computer that manages, on behalf of the virtual machine monitor and the at least one guest operating system, physical resources coupled to the physical computer. The host is adapted to virtualize a Peripheral Component Interconnect (PCI) configuration address space, whereby the at least one guest operating system controls PCI input/output (I/O) devices directly and without I/O emulation.
In some systems, a portion of an address space of a Central Processing Unit (CPU) coupled to the adapter is mapped to the adapter's address space so that CPU instructions accessing memory can directly manipulate data in the adapter's address space.
Disclosure of Invention
According to an aspect of the present invention, a capability is provided for facilitating communication with an adapter, such as a PCI or PCIe adapter. Control instructions specifically designed to transfer data to and from the adapter are provided and used for communication.
The shortcomings of the prior art are overcome and advantages are provided through the provision of a computer program product for executing a load instruction for loading data from an adapter. The computer program product includes a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method. The method includes, for example, obtaining a machine instruction for execution, the machine instruction being defined for computer execution according to a computer architecture, the machine instruction including: an opcode field identifying a load from adapter instruction; a first field identifying a first location into which data retrieved from an adapter is to be loaded; and a second field identifying a second location, the contents of which include a function handle identifying the adapter, a designation of an address space within the adapter from which data is to be loaded, and an offset within the address space; and executing the machine instruction, the executing including: obtaining a function table entry associated with the adapter using the function handle; obtaining a data address of the adapter using at least one of information in the function table entry and the offset; and retrieving data from a specific location in the address space identified by the designation of the address space, wherein the specific location is identified by the data address of the adapter.
Methods and systems relating to one or more aspects of the present invention are also described and claimed herein.
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.
Drawings
Preferred embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1A depicts one embodiment of a computing environment to incorporate and use one or more aspects of the present invention;
FIG. 1B illustrates one embodiment of a device table entry located in the I/O hub of FIG. 1A and used in accordance with an aspect of the present invention;
FIG. 1C depicts another embodiment of a computing environment to incorporate and use one or more aspects of the present invention;
FIG. 2 illustrates one example of an address space of an adapter function in accordance with an aspect of the present invention;
FIG. 2B depicts one embodiment of a function handle used to locate a function table entry, in accordance with an aspect of the present invention;
FIG. 3A illustrates one example of a function table entry used in accordance with an aspect of the present invention;
FIG. 3B depicts one embodiment of a function handle used in accordance with an aspect of the present invention;
FIG. 4A depicts one embodiment of a PCI load instruction, used in accordance with an aspect of the present invention;
FIG. 4B depicts one embodiment of the fields used by the PCI load instruction of FIG. 4A, in accordance with an aspect of the present invention;
FIG. 4C depicts one embodiment of another field used by the PCI load instruction of FIG. 4A, in accordance with an aspect of the present invention;
FIGS. 5A-5B illustrate one embodiment of logic to perform a PCI load operation, in accordance with an aspect of the present invention;
FIG. 6A depicts one embodiment of a PCI store instruction, used in accordance with an aspect of the present invention;
FIG. 6B depicts one embodiment of fields used by the PCI store instruction of FIG. 6A, in accordance with an aspect of the present invention;
FIG. 6C depicts one embodiment of another field used by the PCI store instruction of FIG. 6A, in accordance with an aspect of the present invention;
FIGS. 7A-7B illustrate one embodiment of logic to perform a PCI store operation, in accordance with an aspect of the present invention;
FIG. 8A depicts one embodiment of a PCI store block instruction used in accordance with an aspect of the present invention;
FIG. 8B depicts one embodiment of fields used by the PCI store Block instruction of FIG. 8A, in accordance with an aspect of the present invention;
FIG. 8C depicts one embodiment of another field used by the PCI store Block instruction of FIG. 8A, in accordance with an aspect of the present invention;
FIG. 8D depicts one embodiment of yet another field used by the PCI store Block instruction of FIG. 8A, in accordance with an aspect of the present invention;
FIGS. 9A-9B illustrate one embodiment of logic to perform a PCI store block operation in accordance with an aspect of the present invention;
FIG. 10 depicts one embodiment of a computer program product incorporating one or more aspects of the present invention;
FIG. 11 depicts one embodiment of a host computer system to incorporate and use one or more aspects of the present invention;
FIG. 12 illustrates another example of a computer system that incorporates and uses one or more aspects of the present invention;
FIG. 13 illustrates another example of a computer system, comprising a computer network, that incorporates and uses one or more aspects of the present invention;
FIG. 14 depicts one embodiment of various elements of a computer system that incorporates and uses one or more aspects of the present invention;
FIG. 15A depicts one embodiment of an execution unit of the computer system of FIG. 14 to incorporate and use one or more aspects of the present invention;
FIG. 15B depicts one embodiment of a branch unit of the computer system of FIG. 14 that incorporates and uses one or more aspects of the present invention;
FIG. 15C depicts one embodiment of a load/store unit of the computer system of FIG. 14 incorporating and using one or more aspects of the present invention; and
FIG. 16 depicts one embodiment of an emulated host computer system incorporating and using one or more aspects of the present invention.
Detailed Description
According to an aspect of the invention, one or more control instructions are provided to facilitate communication with an adapter of a computing environment. The control instructions are specifically designed for communicating data to and from the address space of the adapter.
Moreover, as used herein, the term adapter includes any type of adapter (e.g., storage adapter, network adapter, processing adapter, PCI adapter, cryptographic adapter, other types of input/output adapters, etc.). In one embodiment, an adapter includes one adapter function. However, in other embodiments, an adapter may include multiple adapter functions. One or more aspects of the present invention may be applied regardless of whether an adapter includes one adapter function or multiple adapter functions. In one embodiment, if an adapter includes multiple adapter functions, each function may communicate in accordance with an aspect of the present invention. Further, in the examples presented herein, adapter is used interchangeably with adapter function (e.g., PCI function), unless otherwise noted.
One embodiment of a computing environment to incorporate and use one or more aspects of the present invention is described with reference to FIG. 1A. In one example, computing environment 100 is a System z® server offered by International Business Machines Corporation. System z® is based on the z/Architecture® offered by International Business Machines Corporation. Details regarding the z/Architecture® are described in an IBM® publication entitled "z/Architecture Principles of Operation," IBM Publication No. SA22-7832-07, February 2009. IBM®, System z® and z/Architecture® are registered trademarks of International Business Machines Corporation, Armonk, New York. Other names used herein may be registered trademarks, trademarks or product names of International Business Machines Corporation or other companies.
In one example, computing environment 100 includes one or more Central Processing Units (CPUs) 102 coupled to a system memory 104 (also referred to as main memory) via a memory controller 106. To access system memory 104, a central processing unit 102 issues a read or write request that includes an address used to access system memory. The address included in the request is typically not directly usable to access system memory, and therefore it is translated to an address that is directly usable. The address is translated via a translation mechanism (XLATE) 108; for example, the address is translated from a virtual address to a real or absolute address using Dynamic Address Translation (DAT).
A request including an address (translated if necessary) is received by memory controller 106. In one example, the memory controller 106 contains hardware and is used to arbitrate for access to system memory and to maintain memory coherency. This arbitration is performed for requests received from the CPU 102 and requests received from one or more adapters 110. Similar to a central processing unit, the adapter issues a request to system memory 104 to obtain access to the system memory.
In one example, adapter 110 is a Peripheral Component Interconnect (PCI) or PCI Express (PCIe) adapter that includes one or more PCI functions. The PCI function issues a request that is routed to an input/output hub 112 (e.g., a PCI hub) via one or more switches (e.g., PCIe switches) 114. In one example, the input/output hub includes hardware including one or more state machines and is coupled to the memory controller 106 via an I/O-to-memory bus 120.
The input/output hub includes, for example, a root complex 116 that receives requests from switches. The request includes an input/output address that is provided to the address translation and protection unit 118, and the address translation and protection unit 118 accesses information for the request. As examples, the request may include an input/output address used to perform a Direct Memory Access (DMA) operation or request a Message Signaled Interrupt (MSI). The address translation and protection unit 118 accesses information for the DMA or MSI request. As a particular example, for DMA operations, information may be obtained to translate addresses. The translated address is then forwarded to the memory controller to access the system memory.
In one example, the information for the DMA or MSI request issued by the adapter is obtained from a device table entry 130 of a device table 132 located in the I/O hub (e.g., in the address translation and protection unit), as described with reference to FIG. 1B. The device table entries include information for adapters, and each adapter has at least one device table entry associated therewith. For example, there is one device table entry per address space (in system memory) assigned to an adapter. For requests issued from an adapter (e.g., PCI function 138), the device table entry is located using the request ID provided in the request.
Referring now to FIG. 1C, in a further embodiment of a computing environment, a central processing complex is coupled to a memory controller 106 in addition to or in place of one or more CPUs 102. In this example, the central processing complex 150 includes, for example, one or more partitions or regions 152 (e.g., logical partitions LP 1-LPn), one or more central processors (e.g., CP 1-CPm) 154, and a hypervisor 156 (e.g., a logical partition manager), each of which is described below.
Each logical partition 152 is capable of functioning as a separate system. That is, each logical partition can be independently reset, initially loaded with an operating system or a hypervisor (such as z/VM® offered by International Business Machines Corporation, Armonk, New York), if desired, and operate with different programs. An operating system, hypervisor, or application program running in a logical partition appears to have access to a full and complete system, but only a portion of it is available. A combination of hardware and licensed internal code (also referred to as microcode or millicode) keeps a program in one logical partition from interfering with a program in a different logical partition. This allows several different logical partitions to operate on a single or multiple physical processors in a time-sliced manner. In this particular example, each logical partition has a resident operating system 158, which may differ for one or more logical partitions. In one embodiment, operating system 158 is the z/OS® operating system or the zLinux operating system, offered by International Business Machines Corporation, Armonk, New York. z/OS® and z/VM® are registered trademarks of International Business Machines Corporation, Armonk, New York.
Central processors 154 are physical processor resources allocated to the logical partitions. For example, a logical partition 152 includes one or more logical processors, each of which represents all or a share of the physical processor resources 154 allocated to the partition. The underlying processor resources may be dedicated to the partition or shared with another partition.
The logical partitions 152 are managed by a hypervisor 156 implemented by firmware running on the processors 154. Logical partition 152 and hypervisor 156 each comprise one or more programs that reside in respective portions of the central storage associated with the central processors. One example of hypervisor 156 is a processor resource/system manager (PR/SM) provided by International Business machines corporation of Armonk, N.Y..
As used herein, firmware includes, for example, the microcode, millicode, and/or macrocode of the processor. It includes, for instance, the hardware-level instructions and/or data structures used in the implementation of higher-level machine code. In one embodiment, it includes, for example, proprietary code that is typically delivered as microcode, which includes trusted software or microcode specific to the underlying hardware and controls operating system access to the system hardware.
Although a central processing complex with logical partitions is described in this example, one or more aspects of the present invention may be incorporated into or used by other processing units, including single or multiple ones of multiple processing units that are not partitioned. The central processing complex described herein is merely an example.
As described above, the adapter may issue requests to the processor to request various operations, such as direct memory access, message signaled interrupts, and the like. Further, the processor may issue a request to the adapter. For example, returning to FIG. 1B, the processor 102 may issue a request to access the adapter function 138. The request is routed from the processor to the adapter function via the I/O hub 112 and one or more switches 114. In the present embodiment, the memory controller is not shown. However, the I/O hub may be coupled to the processor directly or via a memory controller.
As an example, an operating system 140 executing in the processor issues an instruction to the adapter function to request a particular operation. In this example, the instructions issued by the operating system are specific to the underlying I/O infrastructure. That is, since the I/O infrastructure is based on PCI or PCIe (both referred to herein as PCI, unless otherwise noted), the instructions are PCI instructions. Example PCI instructions include PCI load, PCI store, and modify PCI function controls, to name a few. Although in this example the I/O infrastructure and the instructions are PCI-based, in other embodiments, other infrastructures and corresponding instructions may be used.
In one particular example, an instruction is directed to a particular location within an address space of the adapter function. For example, as shown in FIG. 2, the adapter function 138 includes memory 200, which is defined as a plurality of address spaces, including, for example: a configuration space 202 (e.g., a PCI configuration space for a PCI function), an I/O space 204 (e.g., a PCI I/O space), and one or more memory spaces 206 (e.g., PCI memory spaces). In other embodiments, more, fewer, or different address spaces may be provided. An instruction is directed to a particular address space and a particular location within that address space, and checks are performed to ensure that the configuration (e.g., operating system, LPAR, processor, guest, etc.) that issued the instruction is authorized to access the adapter function.
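As a minimal sketch (not part of the patent text), the address-space designation carried by these instructions can be modeled as a small enumeration; the symbolic names and numeric codes below are illustrative assumptions only.

```c
/* Sketch: the adapter-function address spaces described above.
 * Names and numeric codes are assumptions; the patent does not
 * prescribe a particular encoding. */
typedef enum {
    PCI_AS_CONFIG = 0, /* configuration space 202 */
    PCI_AS_IO     = 1, /* I/O space 204 */
    PCI_AS_MEMORY = 2  /* one of the memory spaces 206 (selected via a BAR) */
} pci_address_space_t;
```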
To facilitate processing of the instructions, information stored in one or more data structures is used. One such data structure that includes information about the adapter is a function table 300 stored, for example, in secure memory. As shown in FIG. 3A, in one example, the function table 300 includes one or more Function Table Entries (FTEs) 302. In one example, there is one function table entry per adapter. Each function table entry 302 includes information used in the processing associated with its adapter function. In one example, the function table entry 302 includes, for example:
Instance number 308: this field indicates the particular instance of the function handle associated with the function table entry;
Device Table Entry (DTE) index 1…n 310: there may be one or more device table indices, and each index is an index into the device table used to locate a Device Table Entry (DTE). There may be one or more device table entries per adapter function, and each entry includes information associated with its adapter function, including information used to process requests of the adapter function (e.g., DMA requests, MSI requests) and information relating to requests directed to the adapter function (e.g., PCI instructions). Each device table entry is associated with one address space in system memory assigned to the adapter function. An adapter function may have one or more address spaces within system memory assigned to it.
Busy indicator 312: this field indicates whether the adapter function is busy;
permanent error status indicator 314: this field indicates whether the adapter function is in a permanent error state;
Recovery initiation indicator 316: this field indicates whether recovery for the adapter function has been initiated;
Permission indicator 318: this field indicates whether the operating system attempting to enable the adapter function is authorized to do so;
enable indicator 320: this field indicates whether the adapter function is enabled (e.g., 1= enabled, 0= disabled);
Requestor ID (RID) 322: this is an identifier of the adapter function, and includes, for example, an adapter bus number, a device number, and a function number. This field is used, for instance, in accessing the configuration space of the adapter function.
For example, the configuration space may be accessed by specifying the configuration space in an instruction issued by the operating system (or another configuration) to the adapter function. Specified in the instruction are an offset into the configuration space and a function handle used to locate the appropriate function table entry that includes the RID. The firmware receives the instruction and determines that it is for a configuration space. It therefore uses the RID to generate a request to the I/O hub, and the I/O hub creates a request to access the adapter. The location of the adapter function is based on the RID, and the offset specifies an offset into the configuration space of the adapter function. For example, the offset may specify a register within the configuration space.
Base Address Register (BAR) (1 to n) 324: this field includes a plurality of unsigned integers, designated BAR0 through BARn, that are associated with the adapter function and whose values are also stored in the base address registers associated with the adapter function. Each BAR specifies the starting address of a memory space or I/O space within the adapter function, and also indicates the type of address space, i.e., whether it is a 64-bit or 32-bit memory space, or a 32-bit I/O space, as examples;
in one example, it is used to access memory space and/or I/O space of an adapter function. For example, an offset provided in the instruction to access the adapter function is added to a value in a base address register associated with the address space specified in the instruction to obtain an address to be used to access the adapter function. The address space identifier provided in the instruction identifies the address space within the adapter function to be accessed and the corresponding BAR to be used.
Size 1…n 326: this field includes a plurality of unsigned integers, designated SIZE0 through SIZEn. The value of a size field, when non-zero, represents the size of the corresponding address space, with each entry corresponding to a previously described BAR.
Further details regarding the BAR and dimensions are described below.
1. When a BAR is not implemented for the adapter function, both the BAR field and its corresponding size field are stored as zeros.
2. When the BAR field represents an I/O address space or a 32-bit memory address space, the corresponding size field is non-zero and represents the size of the address space.
3. When the BAR field represents a 64-bit memory address space,
a. The BARn field represents the least significant address bits.
b. The next consecutive BARn +1 field represents the most significant address bits.
c. The corresponding size field is non-zero and represents the size of the address space.
d. The corresponding size n+1 field has no meaning and is stored as zero.
Internal routing information 328: this information is used to perform a special routing to the adapter. For example, it includes nodes, processor chips, and I/O hub addressing information.
Status indicator 330: this provides, among other indications, an indication of whether load/store operations are blocked.
In one example, the busy indicator, the permanent error state indicator, and the recovery initiation indicator are set based on monitoring performed by the firmware. Further, the permission indicator is set, for example, based on policy. The BAR information is based on configuration information discovered during a bus walk by the processor (e.g., by firmware of the processor). Other fields may be set based on configuration, initialization, and/or events. In other embodiments, the function table entry may include more, fewer, or different information. The information included may depend on the operations supported or enabled by the adapter function.
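To make the fields above concrete, the following is a minimal C sketch of a function table entry. The types, field widths, and MAX_* limits are illustrative assumptions; the patent describes the fields, not a concrete layout.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of a function table entry (FTE) with the fields described above.
 * Field widths, types, and the MAX_* limits are assumptions. */
#define MAX_DTE_INDICES 4   /* assumed upper bound on DTE indices per function */
#define MAX_BARS        6   /* assumed upper bound on BARs (BAR0..BARn) */

struct function_table_entry {
    uint32_t instance_number;              /* instance of the function handle (308) */
    uint16_t dte_index[MAX_DTE_INDICES];   /* device table entry indices 1..n (310) */
    bool     busy;                         /* busy indicator (312) */
    bool     permanent_error;              /* permanent error state indicator (314) */
    bool     recovery_initiated;           /* recovery initiation indicator (316) */
    bool     permission;                   /* permission indicator (318) */
    bool     enabled;                      /* enable indicator (320): 1 = enabled */
    uint16_t rid;                          /* requestor ID: bus/device/function (322) */
    uint64_t bar[MAX_BARS];                /* BAR starting addresses (324) */
    uint64_t size[MAX_BARS];               /* size of each address space (326) */
    uint32_t internal_routing;             /* node/chip/hub routing info (328), opaque here */
    uint32_t status;                       /* status indications, e.g. load/store blocked (330) */
};
```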
To locate a function table entry in a function table that includes one or more entries, in one embodiment, a function handle is used. For example, one or more bits of the function handle are used as an index into the function table to locate a particular function table entry.
Referring to FIG. 3B, additional details regarding the function handle are described. In one example, the function handle 350 includes an enable indicator 352 indicating whether the PCI function handle is enabled; a PCI function number 354 that identifies the function (this is a static identifier and, in one embodiment, an index into the function table); and an instance number 356 indicating the particular instance of this function handle. For example, each time the function is enabled, the instance number is incremented to provide a new instance number.
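A brief sketch of how such a handle might be decomposed follows. The 32-bit width and the bit positions are assumptions for illustration; the text above specifies only that the handle carries an enable indicator, a function number usable as an index into the function table, and an instance number.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch: decomposing a PCI function handle (FIG. 3B) under an assumed layout. */
typedef uint32_t pci_function_handle;

static inline bool     handle_enabled(pci_function_handle h)         { return (h >> 31) & 0x1; }   /* enable indicator 352 */
static inline uint32_t handle_function_number(pci_function_handle h) { return (h >> 16) & 0x7FFF; } /* function number 354 */
static inline uint32_t handle_instance_number(pci_function_handle h) { return h & 0xFFFF; }         /* instance number 356 */

/* The function number locates the function table entry, e.g.:
 *   struct function_table_entry *fte = &function_table[handle_function_number(h)];
 */
```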
According to an aspect of the present invention, to access an adapter function, a configuration issues an instruction to the adapter function that is executed by a processor. In the examples herein, the configuration is an operating system, but in other examples it may be a system, a processor, a logical partition, a guest, etc. These requests are made via specific instructions that access the adapter. Example instructions include PCI load, PCI store, and PCI store block instructions. These instructions are specific to the adapter architecture (e.g., PCI). Further details regarding these instructions are described below. For example, one embodiment of a PCI load instruction is described with reference to FIGS. 4A-5B; one embodiment of a PCI store instruction is described with reference to FIGS. 6A-7B; and one embodiment of a PCI store block instruction is described with reference to FIGS. 8A-9B.
Referring initially to FIG. 4A, one embodiment of a PCI load instruction is illustrated. As shown, PCI load instruction 400 includes, for example, an opcode 402 indicating a PCI load instruction; a first field 404 specifying a location into which data obtained from the adapter function is to be loaded; and a second field 406 specifying a location that includes various information related to the adapter function from which data is loaded. The contents of the locations specified by fields 1 and 2 are further described below.
In one example, field 1 specifies a general purpose register, and as shown in FIG. 4B, the contents 404 of the register comprise a contiguous range of one or more bytes loaded from the adapter location specified in the instruction. In one example, data is loaded into the rightmost byte position of the register.
In one embodiment, field 2 designates a general register pair that includes various information. As shown in FIG. 4B, the contents of the registers include, for example:
Enabled handle 410: this field is the enabled function handle of the adapter function from which data is loaded;
address space 412: this field identifies the address space within the adapter function from which the data is loaded;
offset 414 in address space: this field specifies an offset within the specified address space from which the data is loaded;
length field 416: this field specifies the length of the load operation (e.g., the number of bytes to be loaded); and
status field 418: this field provides a status code that can be used when the instruction ends with a predetermined condition code.
In one embodiment, the bytes loaded from the adapter function are to be contained within an integral boundary in the designated PCI address space of the adapter function. The integral boundary size is, for example, a doubleword when the address space field specifies a memory address space, and also, for example, a doubleword when the address space field specifies an I/O address space or a configuration address space.
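As a rough illustration, the operands described in FIGS. 4A-4C can be gathered into one structure; the field widths below are assumptions, since the text specifies the contents of the register pair but not an exact bit layout.

```c
#include <stdint.h>

/* Sketch of the operands carried by the PCI load instruction (FIGS. 4A-4C).
 * Field widths are illustrative assumptions. */
struct pci_load_operands {
    /* field 1: general register receiving the loaded bytes (rightmost positions) */
    uint64_t result;
    /* field 2: general register pair describing the access */
    uint32_t handle;        /* enabled function handle (410) */
    uint8_t  address_space; /* address space designation (412) */
    uint64_t offset;        /* offset within the address space (414) */
    uint8_t  length;        /* number of bytes to load (416) */
    uint8_t  status;        /* status code on predetermined condition codes (418) */
};
```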
One embodiment of the logic associated with a PCI load instruction is described with reference to FIGS. 5A-5B. In one example, the instructions are issued by an operating system (or other configuration) and executed by a processor (e.g., firmware) executing the operating system. In the example herein, the instruction and adapter functions are PCI based. However, in other examples, different adapter architectures and corresponding instructions may be used.
To issue the instruction, the operating system provides the following operands to the instruction (e.g., in one or more registers designated by the instruction): the PCI function handle, the PCI address space (PCIAS), the offset into the PCI address space, and the length of the data to be loaded. Upon successful completion of the PCI load instruction, the data is loaded into the location (e.g., register) designated by the instruction.
Referring to FIG. 5A, initially, a determination is made as to whether the facility that provides the PCI load instruction is installed, INQUIRY 500. This determination is made, for example, by examining an indicator stored in a control block. If the facility is not installed, an exception condition is provided, STEP 502. Otherwise, a determination is made as to whether the operands are aligned, INQUIRY 504. For example, if certain operands are required to be in even/odd register pairs, a determination is made as to whether those requirements are met. If the operands are not aligned, an exception condition is provided, STEP 506. Otherwise, if the facility is installed and the operands are aligned, a determination is made as to whether the handle provided in the operands of the PCI load instruction is enabled, INQUIRY 508. In one example, this determination is made by examining an enable indicator in the handle. If the handle is not enabled, an exception condition is provided, STEP 510.
If the handle is enabled, the handle is used to locate the function table entry, STEP 512. That is, at least a portion of the handle is used as an index into the function table to locate the function table entry corresponding to the adapter function from which the data was loaded.
Thereafter, if the configuration issuing the instruction is a guest, a determination is made as to whether the function is configured for use by the guest, INQUIRY 514. If not, an exception condition is provided, STEP 516. If the configuration is not a guest, this inquiry may be ignored, or other authorizations may be checked, if specified. (In one example, in the z/Architecture®, a pageable guest is interpretively executed via the Start Interpretive Execution (SIE) instruction, at level 2 of interpretation. For example, the Logical Partition (LPAR) hypervisor executes the SIE instruction to begin the logical partition in physical, fixed memory. If z/VM® is the operating system in that logical partition, it issues the SIE instruction to execute its guest (virtual) machines in its V=V (virtual) storage. Therefore, the LPAR hypervisor uses level-1 SIE, and the z/VM® hypervisor uses level-2 SIE.)
A determination is then made as to whether the function is enabled, INQUIRY 518. In one example, this determination is made by checking an enable indicator in the function table entry. If it is not enabled, an exception condition is provided, step 520.
If the function is enabled, a determination is made as to whether the address space is valid, INQUIRY 522. For example, is the specified address space a designated address space of the adapter function, and is it appropriate for this instruction? If the address space is not valid, an exception condition is provided, STEP 524. Otherwise, a determination is made as to whether load/store operations are blocked, INQUIRY 526. In one example, this determination is made by checking the status indicator in the function table entry. If load/store operations are blocked, an exception condition is provided, STEP 528.
However, if the load/store is not blocked, a determination is made as to whether the restore is active, INQUIRY 530. In one example, this determination is made by checking a recovery initiation indicator in the function table entry. If recovery is active, an exception condition is provided, step 532. Otherwise, a determination is made as to whether the function is busy, INQUIRY 534. This determination is made by checking the busy indicator in the function table entry. If the function is busy, a busy condition is provided, step 536. With the busy condition, the instruction may be retried instead of being discarded.
If the function is not busy, a further determination is made as to whether the offset specified in the instruction is valid, INQUIRY 538. That is, is the offset, combined with the length of the operation, within the bounds of the address space (e.g., within the base address and length of the address space, as specified in the function table entry)? If not, an exception condition is provided, STEP 540. However, if the offset is valid, a determination is made as to whether the length is valid, INQUIRY 542. That is, is the length valid for the address space type, the offset within the address space, and the integral boundary size? If not, an exception condition is provided, STEP 544. Otherwise, processing of the load instruction continues. (In one embodiment, the firmware performs the checks described above.)
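The sequence of checks in FIG. 5A (and the analogous checks in FIGS. 7A and 9A) can be sketched as follows. The condition outcomes, helper functions, and the reduced view of the function table entry are assumptions for illustration, not architected definitions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative outcomes; not the architected condition codes. */
enum pci_cc { CC_OK, CC_EXCEPTION, CC_BUSY };

/* Assumed minimal view of a function table entry for these checks. */
struct fte {
    bool enabled, busy, recovery_initiated, load_store_blocked, guest_authorized;
};

/* Assumed helpers standing in for firmware facilities described in the text. */
extern bool facility_installed(void);
extern bool handle_is_enabled(uint32_t handle);
extern struct fte *lookup_fte(uint32_t handle);   /* index into the function table */
extern bool address_space_valid(const struct fte *f, unsigned space);
extern bool offset_valid(const struct fte *f, unsigned space, uint64_t offset, unsigned length);
extern bool length_valid(const struct fte *f, unsigned space, uint64_t offset, unsigned length);

/* The checks of FIG. 5A, performed in order; the first failing check wins. */
enum pci_cc pci_load_checks(uint32_t handle, unsigned space, uint64_t offset,
                            unsigned length, bool issuer_is_guest)
{
    if (!facility_installed())                   return CC_EXCEPTION; /* INQUIRY 500 */
    /* operand alignment (INQUIRY 504) is a register-pair constraint, omitted here */
    if (!handle_is_enabled(handle))              return CC_EXCEPTION; /* INQUIRY 508 */
    struct fte *f = lookup_fte(handle);                                /* STEP 512   */
    if (issuer_is_guest && !f->guest_authorized) return CC_EXCEPTION; /* INQUIRY 514 */
    if (!f->enabled)                             return CC_EXCEPTION; /* INQUIRY 518 */
    if (!address_space_valid(f, space))          return CC_EXCEPTION; /* INQUIRY 522 */
    if (f->load_store_blocked)                   return CC_EXCEPTION; /* INQUIRY 526 */
    if (f->recovery_initiated)                   return CC_EXCEPTION; /* INQUIRY 530 */
    if (f->busy)                                 return CC_BUSY;      /* INQUIRY 534: may be retried */
    if (!offset_valid(f, space, offset, length)) return CC_EXCEPTION; /* INQUIRY 538 */
    if (!length_valid(f, space, offset, length)) return CC_EXCEPTION; /* INQUIRY 542 */
    return CC_OK;
}
```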
Continuing with FIG. 5B, a determination is made by the firmware as to whether the load is for the configuration address space of the adapter function, INQUIRY 550. That is, based on the layout of the adapter function's memory, is the particular address space provided in the instruction the configuration space? If so, the firmware performs various processing to provide the request to the hub coupled to the adapter function; the hub then routes the request to the function, STEP 552.
For example, the firmware obtains the requestor ID from the function table entry pointed to by the function handle provided in the instruction operator. Further, based on information in the function table entry (e.g., internal routing information), the firmware determines the hub that received the request. That is, the environment may have one or more hubs, and the firmware determines the hub coupled to the adapter function. It then forwards the request to the hub. The hub generates a configuration read request packet that flows out of the PCI bus to the adapter function identified by the RID in the function table entry. The configuration read request includes the RID and offset (i.e., data address) for obtaining the data, as described below.
Returning to INQUIRY 550, if the specified address space is not a configuration space, the firmware again performs various processing to provide the request to the hub, STEP 554. The firmware uses the handle to select a function table entry, and it obtains from that entry the information used to locate the appropriate hub. It also calculates the data address to be used in the load operation. The address is calculated by adding the BAR starting address obtained from the function table entry (using the BAR associated with the address space identifier provided in the instruction) to the offset provided in the instruction. The calculated data address is provided to the hub. The hub then takes that address and includes it in a request packet, such as a DMA read request packet, that flows over the PCI bus to the adapter function.
In response to receiving the request, whether via STEP 552 or STEP 554, the adapter function fetches the requested data from the specified location (i.e., at the data address) and returns it in a response, STEP 556, and the response is forwarded from the adapter function to the I/O hub. In response to receiving the response, the hub forwards it to the initiating processor. The initiating processor then extracts the data from the response packet and loads it at the location designated in the instruction (e.g., the register specified by field 1 404). The PCI load operation ends with an indication of success (e.g., a condition code set to zero).
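A compact sketch of the firmware-side routing in FIG. 5B follows; the helper functions standing in for the hub's request packets, and the reduced function-table-entry view, are assumptions for illustration.

```c
#include <stdint.h>

#define PCI_AS_CONFIG 0   /* assumed designation of the configuration space */

/* Assumed minimal view of a function table entry. */
struct fte {
    uint16_t rid;           /* requestor ID */
    uint64_t bar_base[6];   /* BAR starting addresses, indexed by address-space designation */
};

/* Assumed helpers standing in for the request packets the I/O hub emits. */
extern uint64_t issue_config_read(uint16_t rid, uint64_t offset, unsigned length);
extern uint64_t issue_dma_read(uint64_t data_address, unsigned length);

/* Sketch of FIG. 5B: route the load either to the configuration space (STEP 552,
 * addressed by RID plus offset) or to a memory/I/O space (STEP 554, addressed by
 * BAR starting address plus offset). */
static uint64_t pci_load_data(const struct fte *f, unsigned space,
                              uint64_t offset, unsigned length)
{
    if (space == PCI_AS_CONFIG)                        /* INQUIRY 550 */
        return issue_config_read(f->rid, offset, length);
    return issue_dma_read(f->bar_base[space] + offset, length);
}
```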
In addition to a load instruction that fetches data from an adapter function and stores it in a specified location, another instruction that may be executed is a store instruction. The store instruction stores data at a specified location of the adapter function. One embodiment of a PCI store instruction is described with reference to FIG. 6A. As shown, PCI store instruction 600 includes, for example, opcode 602, indicating a PCI store instruction; a first field 604 specifying a location including data to be stored at the adapter function; and a second field 606 that specifies a location that includes various information related to the adapter function to which the data is to be stored. The contents of the locations specified by fields 1 and 2 are further described below.
In one example, field 1 specifies a general purpose register, and as shown in FIG. 6B, the contents 604 of the register comprise a contiguous range of one or more bytes of data to be stored at the specified location of the adapter function. In one example, the data is taken from the rightmost byte positions of the register.
In one embodiment, field 2 designates a general register pair that includes various information. As shown in FIG. 6B, the contents of the registers include, for example:
Enabled handle 610: this field is the enabled function handle of the adapter function to which the data is stored;
Address space 612: this field identifies the address space within the adapter function to which the data is stored;
Offset 614 within the address space: this field specifies an offset, within the specified address space, to which the data is stored;
Length field 616: this field specifies the length of the store operation (e.g., the number of bytes to be stored); and
Status field 618: this field provides a status code that can be used when the instruction ends with a predetermined condition code.
One embodiment of the logic associated with a PCI store instruction is described with reference to FIGS. 7A-7B. In one example, instructions are issued by an operating system and executed by a processor (e.g., firmware) executing the operating system.
To issue the instruction, the operating system provides the following operands to the instruction (e.g., in one or more registers designated by the instruction): the PCI function handle, the PCI address space (PCIAS), the offset into the PCI address space, the length of the data to be stored, and a pointer to the data to be stored. Upon successful completion of the PCI store instruction, the data is stored at the location designated by the instruction.
Referring to FIG. 7A, initially, a determination is made as to whether the facility that provides the PCI store instruction is installed, INQUIRY 700. This determination is made, for example, by examining an indicator stored in a control block. If the facility is not installed, an exception condition is provided, STEP 702. Otherwise, a determination is made as to whether the operands are aligned, INQUIRY 704. For example, if certain operands are required to be in even/odd register pairs, a determination is made as to whether those requirements are met. If the operands are not aligned, an exception condition is provided, STEP 706. Otherwise, if the facility is installed and the operands are aligned, a determination is made as to whether the handle provided in the operands of the PCI store instruction is enabled, INQUIRY 708. In one example, this determination is made by examining an enable indicator in the handle. If the handle is not enabled, an exception condition is provided, STEP 710.
If the handle is enabled, the handle is used to locate the function table entry, STEP 712. That is, at least a portion of the handle is used as an index into the function table to locate the function table entry corresponding to the adapter function to which the data is stored.
Thereafter, if the configuration issuing the instruction is a guest, a determination is made as to whether the function is configured for use by the guest, INQUIRY 714. If not, an exception condition is provided, STEP 716. If the configuration is not a guest, this inquiry may be ignored, or other authorizations may be checked, if specified.
A determination is then made as to whether the function is enabled, INQUIRY 718. In one example, this determination is made by checking an enable indicator in the function table entry. If it is not enabled, an exception condition is provided, step 720.
If the function is enabled, a determination is made as to whether the address space is valid, INQUIRY 722. For example, is the specified address space a designated address space of the adapter function, and is it appropriate for this instruction? If the address space is not valid, an exception condition is provided, STEP 724. Otherwise, a determination is made as to whether load/store operations are blocked, INQUIRY 726. In one example, this determination is made by checking the status indicator in the function table entry. If load/store operations are blocked, an exception condition is provided, STEP 728.
However, if the load/store is not blocked, a determination is made as to whether the restore is active, INQUIRY 730. In one example, this determination is made by checking a recovery initiation indicator in the function table entry. If recovery is active, an exception condition is provided, step 732. Otherwise, a determination is made as to whether the function is busy, INQUIRY 734. This determination is made by checking the busy indicator in the function table entry. If the function is busy, a busy condition is provided, step 736. With the busy condition, the instruction may be retried instead of being discarded.
If the function is not busy, a further determination is made as to whether the offset specified in the instruction is valid, INQUIRY 738. That is, is the offset, combined with the length of the operation, within the bounds of the address space (e.g., within the base address and length of the address space, as specified in the function table entry)? If not, an exception condition is provided, STEP 740. However, if the offset is valid, a determination is made as to whether the length is valid, INQUIRY 742. That is, is the length valid for the address space type, the offset within the address space, and the integral boundary size? If not, an exception condition is provided, STEP 744. Otherwise, processing of the store instruction continues. (In one embodiment, the firmware performs the checks described above.)
Continuing with FIG. 7B, a determination is made by the firmware as to whether the store is for the configuration address space of the adapter function, INQUIRY 750. That is, based on the layout of the adapter function's memory, is the particular address space provided in the instruction the configuration space? If so, the firmware performs various processing to provide the request to the hub coupled to the adapter function; the hub then routes the request to the function, STEP 752.
For example, the firmware obtains the requestor ID from the function table entry pointed to by the function handle provided in the instruction operator. Further, based on information in the function table entry (e.g., internal routing information), the firmware determines the hub that received the request. That is, the environment may have one or more hubs, and the firmware determines the hub coupled to the adapter function. It then forwards the request to the hub. The hub generates a configuration write request packet that flows out of the PCI bus to the adapter function identified by the RID in the function table entry. The configuration write request includes a RID and an offset (i.e., data address) for storing data, as described below.
Returning to INQUIRY 750, if the specified address space is not a configuration space, the firmware again performs various processing to provide the request to the hub, STEP 754. The firmware uses the handle to select a function table entry, and it obtains from that entry the information used to locate the appropriate hub. It also calculates the data address to be used in the store operation. The address is calculated by adding the BAR starting address obtained from the function table entry to the offset provided in the instruction. The calculated data address is provided to the hub. The hub then takes that address and includes it in a request packet, such as a DMA write request packet, that flows over the PCI bus to the adapter function.
In response to receiving the request, whether via STEP 752 or STEP 754, the adapter function stores the provided data at the specified location (i.e., at the data address), STEP 756. The PCI store operation ends with an indication of success (e.g., a condition code set to zero).
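The store path mirrors the load path: the same address derivation, with the data flowing from the register to the adapter. A brief sketch follows; the helper functions and the reduced function-table-entry view are assumptions for illustration.

```c
#include <stdint.h>

#define PCI_AS_CONFIG 0   /* assumed designation of the configuration space */

struct fte {
    uint16_t rid;           /* requestor ID */
    uint64_t bar_base[6];   /* BAR starting addresses, indexed by address-space designation */
};

/* Assumed helpers standing in for the request packets the I/O hub emits. */
extern void issue_config_write(uint16_t rid, uint64_t offset,
                               const void *data, unsigned length);
extern void issue_dma_write(uint64_t data_address,
                            const void *data, unsigned length);

/* Sketch of STEPs 752-756: route the store to the adapter function. */
static void pci_store_data(const struct fte *f, unsigned space, uint64_t offset,
                           const void *data, unsigned length)
{
    if (space == PCI_AS_CONFIG)                      /* INQUIRY 750 / STEP 752 */
        issue_config_write(f->rid, offset, data, length);
    else                                             /* STEP 754: BAR start + offset */
        issue_dma_write(f->bar_base[space] + offset, data, length);
}
```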
In addition to load and store instructions, which typically load or store up to, for example, 8 bytes, another instruction that can be executed is a store block instruction. The store block instruction stores a larger block of data (e.g., 16, 32, 64, 128, or 256 bytes) at a specified location of the adapter function; the block size is not necessarily limited to a size that is a power of 2. In one example, the specified location is in a memory space (not an I/O or configuration space) of the adapter function.
One embodiment of a PCI store Block instruction is described with reference to FIG. 8A. As shown, PCI store block instruction 800 includes, for example, opcode 802, indicating a PCI store block instruction; a first field 804 specifying a location including various information about an adapter function to which data is stored; a second field 806 specifying a location that includes an offset within a specified address space to which data is stored; and a third field 808 specifying the location of an address in system memory that includes data to be stored in the adapter function. The contents of the locations specified by fields 1, 2, and 3 are further described below.
In one example, field 1 specifies a general register that includes various information. As shown in fig. 8B, the contents of the register include, for example:
Enabled handle 810: this field is the enabled function handle of the adapter function to which the data is stored;
address space 812: this field identifies the address space within the adapter function to which the data is stored;
length field 814: this field specifies the length of the store operation (e.g., the number of bytes to be stored); and
status field 816: this field provides a status code that can be used when the instruction ends with a predetermined condition code.
In one example, field 2 specifies a general purpose register, and as shown in FIG. 8C, the contents of the register include a value (e.g., a 64-bit unsigned integer) that indicates an offset within a specified address space to which the data is to be stored.
In one example, as shown in FIG. 8D, field 3 comprises the logical address in system memory 822 of the first byte of data to be stored into the adapter function.
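For illustration, the three fields of FIGS. 8A-8D can be collected into one structure; the field widths below are assumptions, as the text does not give an exact layout.

```c
#include <stdint.h>

/* Sketch of the operands of the PCI store block instruction (FIGS. 8A-8D).
 * Field widths are illustrative assumptions. */
struct pci_store_block_operands {
    /* field 1 (FIG. 8B) */
    uint32_t handle;        /* enabled function handle (810) */
    uint8_t  address_space; /* address space designation (812) -- a memory space */
    uint16_t length;        /* number of bytes to store, e.g. 16..256 (814) */
    uint8_t  status;        /* status code (816) */
    /* field 2 (FIG. 8C) */
    uint64_t offset;        /* offset within the designated address space */
    /* field 3 (FIG. 8D) */
    uint64_t src_address;   /* logical address in system memory of the first byte to store */
};
```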
One embodiment of the logic associated with a PCI store Block instruction is described with reference to FIGS. 9A-9B. In one example, instructions are issued by an operating system and executed by a processor (e.g., firmware) executing the operating system.
To issue the instruction, the operating system provides the following operands to the instruction (e.g., in one or more registers designated by the instruction): the PCI function handle, the PCI address space (PCIAS), the offset into the PCI address space, the length of the data to be stored, and a pointer to the data to be stored. The pointer operand may be specified, for example, by a register and a signed or unsigned displacement. Upon successful completion of the PCI store block instruction, the data is stored at the location designated by the instruction.
Referring to FIG. 9A, initially, a determination is made as to whether the facility that provides the PCI store block instruction is installed, INQUIRY 900. This determination is made, for example, by examining an indicator stored in a control block. If the facility is not installed, an exception condition is provided, STEP 902. Otherwise, if the facility is installed, a determination is made as to whether the handle provided in the operands of the PCI store block instruction is enabled, INQUIRY 904. In one example, this determination is made by examining an enable indicator in the handle. If the handle is not enabled, an exception condition is provided, STEP 906.
If the handle is enabled, the handle is used to locate the function table entry, STEP 912. That is, at least a portion of the handle is used as an index into the function table to locate the function table entry corresponding to the adapter function to which the data is stored. Thereafter, if the configuration issuing the instruction is a guest, a determination is made as to whether the function is configured for use by the guest, INQUIRY 914. If not, an exception condition is provided, STEP 916. If the configuration is not a guest, this inquiry may be ignored, or other authorizations may be checked, if specified.
A determination is then made as to whether the function is enabled, INQUIRY 918. In one example, this determination is made by checking an enable indicator in the function table entry. If it is not enabled, an exception condition is provided, step 920.
If the function is enabled, a determination is made as to whether the address space is valid, INQUIRY 922. For example, is the specified address space a designated address space of the adapter function, and is it one appropriate for this instruction (i.e., a memory space)? If the address space is not valid, an exception condition is provided, STEP 924. Otherwise, a determination is made as to whether load/store operations are blocked, INQUIRY 926. In one example, this determination is made by checking the status indicator in the function table entry. If load/store operations are blocked, an exception condition is provided, STEP 928.
However, if the load/store is not blocked, a determination is made as to whether recovery is active, INQUIRY 930. In one example, this determination is made by checking a recovery initiation indicator in the function table entry. If recovery is active, then an exception condition is provided, step 932. Otherwise, a determination is made as to whether the function is busy, INQUIRY 934. This determination is made by checking the busy indicator in the function table entry. If the function is busy, a busy condition is provided, step 936. With the busy condition, the instruction may be retried instead of being discarded.
If the function is not busy, a further determination is made as to whether the offset specified in the instruction is valid, INQUIRY 938. That is, is the offset, combined with the length of the operation, within the bounds of the address space (e.g., within the base address and length of the address space, as specified in the function table entry)? If not, an exception condition is provided, STEP 940. However, if the offset is valid, a determination is made as to whether the length is valid, INQUIRY 942. That is, is the length valid for the address space type, the offset within the address space, and the integral boundary size? If not, an exception condition is provided, STEP 944. Otherwise, processing of the store block instruction continues. (In one embodiment, the firmware performs the checks described above.)
Continuing with FIG. 9B, a determination is made by the firmware as to whether memory including data to be stored is accessible, INQUIRY 950. If not, an exception condition is provided, step 952. If so, the firmware performs various processes to provide the request to the hub coupled to the adapter function; the hub then routes the request to the function, step 954.
For example, the firmware uses the handle to select a function table entry, and it uses that entry to obtain the information that locates the appropriate hub. It also calculates the data address to be used in the store block operation. The address is calculated by adding the BAR starting address obtained from the function table entry (the BAR identified by the address space identifier) to the offset provided in the instruction. The calculated data address is provided to the hub. In addition, the data referenced by the address provided in the instruction is fetched from system memory and provided to the I/O hub. The hub then takes the data address and includes it in a request packet, such as a DMA write request packet, that flows over the PCI bus to the adapter function.
In response to receiving the request, the adapter function stores the requested data in the specified location (i.e., at the data address), STEP 956. The PCI store block operation is completed with an indication of success (e.g., a condition code set to zero).
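A sketch of this store-block flow (FIG. 9B) follows. The helper functions for the system-memory access and the hub's DMA write packet, and the reduced function-table-entry view, are assumptions for illustration; the validity checks of FIG. 9A are assumed to have already passed.

```c
#include <stdint.h>
#include <stdbool.h>

struct fte { uint64_t bar_base[6]; };   /* assumed minimal view: BAR starting addresses */

/* Assumed helpers for the system-memory access and the hub's DMA write packet. */
extern bool system_memory_accessible(uint64_t addr, uint16_t length);   /* INQUIRY 950 */
extern void fetch_from_system_memory(uint64_t addr, void *buf, uint16_t length);
extern void issue_dma_write(uint64_t data_address, const void *data, uint16_t length);

/* Sketch of FIG. 9B: verify the source memory is accessible, fetch the block,
 * derive the adapter data address (BAR start + offset), and hand both to the hub,
 * which emits a DMA write request packet to the adapter function (STEPs 954-956). */
static int pci_store_block(const struct fte *f, unsigned space, uint64_t offset,
                           uint64_t src_address, uint16_t length)
{
    unsigned char buf[256];                        /* blocks of up to 256 bytes per the text */
    if (length > sizeof buf || !system_memory_accessible(src_address, length))
        return -1;                                 /* exception condition, STEP 952 */
    fetch_from_system_memory(src_address, buf, length);
    issue_dma_write(f->bar_base[space] + offset, buf, length);
    return 0;                                      /* success, e.g. condition code zero */
}
```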
Described in detail above is a capability for communicating with adapters of a computing environment using control instructions specifically designed for such communication. The communication is performed without the use of memory-mapped I/O and is not limited to control registers of the adapter function. The instructions ensure that the configuration issuing them is authorized to access the adapter function. Further, for the store block instruction, it is ensured that the specified main storage location is within the memory of the configuration.
In the embodiments described herein, the adapters are PCI adapters. As used herein, PCI refers to any adapter implemented according to a PCI-based specification as defined by the Peripheral Component Interconnect Special Interest Group (PCI-SIG), including but not limited to PCI or PCIe. In one particular example, Peripheral Component Interconnect Express (PCIe) is a component-level interconnect standard that defines a bidirectional communication protocol for transactions between an I/O adapter and a host system. PCIe communications are encapsulated in packets according to the PCIe standard for transmission over a PCIe bus. Transactions originating at the I/O adapter and terminating at the host system are referred to as upbound transactions. Transactions originating at the host system and terminating at the I/O adapter are referred to as downbound transactions. The PCIe topology is based on point-to-point unidirectional links that are paired (e.g., one upbound link, one downbound link) to form the PCIe bus. The PCIe standard is maintained and published by the PCI-SIG, as noted in the Background section.
As will be appreciated by one skilled in the art, the present invention may be embodied as a system, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a "circuit," "module," or "system." Furthermore, in some embodiments, the invention may also be embodied as a computer program product in one or more computer-readable media having computer-readable program code embodied therein.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Referring now to FIG. 10, in one example, a computer program product 1000 includes, for instance, one or more computer-readable storage media 1002 having computer-readable program code means or logic 1004 stored thereon to provide and facilitate one or more aspects of the present invention.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The present invention is described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means (instructions) which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition to the foregoing, one or more aspects of the present invention may be provided, offered, deployed, managed, serviced, etc. by a service provider who offers management of a user's environment. For example, a service provider can create, maintain, support, etc., computer code and/or computer infrastructure that performs one or more aspects of the present invention for one or more users. The service provider, in turn, may accept payment from the user, for example, according to a subscription and/or fee agreement. Additionally or alternatively, the service provider may receive payment from the sale of advertising content to one or more third parties.
In one aspect of the invention, an application may be deployed to perform one or more aspects of the invention. As one example, deploying an application comprises providing a computer infrastructure operable to perform one or more aspects of the present invention.
As yet another aspect of the present invention, a computing infrastructure may be deployed comprising integrating computer-readable code into a computer system, wherein the code in combination with the computing system is capable of performing one or more aspects of the present invention.
As yet another aspect of the present invention, a process for integrating computing infrastructure comprising integrating computer readable code into a computer system may be provided. The computer system includes a computer-readable medium, wherein the computer medium includes one or more aspects of the present invention. The code in combination with the computer system is capable of performing one or more aspects of the present invention.
While various embodiments are described above, these are only examples. For example, computing environments of other architectures may incorporate and use one or more aspects of the present invention. By way of example, servers other than System z servers, such as Power Systems servers or other servers offered by International Business Machines Corporation, or servers of other companies, may include, use and/or benefit from one or more aspects of the present invention. Moreover, although in the examples illustrated herein the adapters and PCI hub are considered to be part of the server, in other embodiments they need not be considered part of the server, but may simply be considered as being coupled to the system memory and/or other components of the computing environment. The computing environment need not be a server. Further, while tables are described, any data structure may be used, and the term table includes all such data structures. Further, although the adapters are PCI based, one or more aspects of the present invention may be used with other adapters or other I/O components. Adapters and PCI adapters are examples only. Further, the FTE or parameters of the FTE need not be located and maintained in secure memory, including, for example, hardware (e.g., PCI function hardware). The DTE, FTE, and/or handle, as well as the instructions or fields of the instructions, may include more, less, or different information. Many other variations are possible.
Moreover, other types of computing environments may benefit from one or more aspects of the present invention. By way of example, a data processing system suitable for storing and/or executing program code can be used that includes at least two processors coupled directly or indirectly to memory elements through a system bus. The memory elements include, for instance, local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, DASD, magnetic tape, CDs, DVDs, thumb drives, and other storage media) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the available types of network adapters.
Referring to FIG. 11, representative components of a host computer system 5000 to implement one or more aspects of the present invention are depicted. The representative host computer 5000 includes one or more CPUs 5001 in communication with computer memory (i.e., central storage) 5002, as well as I/O interfaces to storage media devices 5011 and networks 5010 for communicating with other computers or SANs and the like. The CPU 5001 conforms to an architecture having an architected instruction set and architected functions. The CPU 5001 may have Dynamic Address Translation (DAT) 5003 for translating program addresses (virtual addresses) to real addresses of memory. A DAT typically includes a Translation Lookaside Buffer (TLB) 5007 for caching translations so that later accesses to a block of computer memory 5002 do not require the delay of address translation. Typically, a cache 5009 is used between the computer memory 5002 and the processor 5001. The cache 5009 may be hierarchical, having a large cache available to more than one CPU, and smaller, faster (lower level) caches between the large cache and each CPU. In some embodiments, the lower level caches are split to provide separate lower level caches for instruction fetching and data accesses. In one embodiment, instructions are fetched from memory 5002 by instruction fetch unit 5004 via the cache 5009. The instructions are decoded in the instruction decode unit 5006 and dispatched (with other instructions, in some embodiments) to one or more instruction execution units 5008. Typically, several execution units 5008 are used, such as an arithmetic execution unit, a floating point execution unit, and a branch instruction execution unit. The instruction is executed by the execution unit, accessing operands from instruction-specified registers or memory as needed. If an operand is to be accessed (loaded or stored) from memory 5002, load/store unit 5005 typically handles the access under the control of the instruction being executed. Instructions may be executed in hardware circuitry, in internal microcode (firmware), or by a combination of both.
Note that a computer system includes information in local (or main) memory, as well as addressing, protection, and reference and change recording. Some aspects of addressing include the format of addresses, the concept of address spaces, the various types of addresses, and the manner in which one type of address is translated to another type of address. Some of main memory includes permanently assigned memory locations. Main memory provides the system with directly addressable, fast-access storage of data. Both data and programs must be loaded into main memory (from input devices) before they can be processed.
The main memory may include one or more smaller, faster-access cache memories, sometimes referred to as caches. The cache is typically physically associated with the CPU or I/O processor. The effects of the physical structure and use of different storage media are not typically observed by a program except in terms of performance.
Separate caches may be maintained for instruction and data operands. Information within a cache is maintained in contiguous bytes on an integral boundary called a cache block or cache line (or line, for short). A model may provide an EXTRACT CACHE ATTRIBUTE instruction which returns the size of a cache line in bytes. A model may also provide PREFETCH DATA and PREFETCH DATA RELATIVE LONG instructions which effect the prefetching of storage into the data or instruction cache, or the releasing of data from the cache.
The memory is considered to be a long horizontal string of bits. For most operations, accesses to memory are made in left-to-right order. The bit string is subdivided into units of eight bits. The eight-bit unit is called a byte, which is the basic building block for all information formats. Each byte location in memory is identified by a unique non-negative integer, which is the address of the byte location, or simply, the byte address. Adjacent byte positions have consecutive addresses, starting at 0 on the left and proceeding in left to right order. The address is an unsigned binary integer and is 24, 31 or 64 bits.
Information is transferred between the memory and the CPU or channel subsystem one byte, or a group of bytes, at a time. Unless otherwise specified, in, for instance, the z/Architecture, a group of bytes in memory is addressed by the leftmost byte of the group. The number of bytes in a group is either implied or explicitly specified by the operation to be performed. When used in a CPU operation, a group of bytes is called a field. Within each group of bytes, in, for instance, the z/Architecture, bits are numbered in a left-to-right sequence. In the z/Architecture, the leftmost bits are sometimes referred to as the "high-order" bits and the rightmost bits as the "low-order" bits. Bit numbers are not memory addresses, however; only bytes can be addressed. To operate on individual bits of a byte in memory, the entire byte is accessed. The bits in a byte are numbered 0 through 7, from left to right (in, e.g., the z/Architecture). The bits in an address may be numbered 8-31 or 40-63 for 24-bit addresses, or 1-31 or 33-63 for 31-bit addresses; they are numbered 0-63 for 64-bit addresses. Within any other fixed-length format of multiple bytes, the bits making up the format are numbered consecutively starting from 0. For purposes of error detection, and preferably correction, one or more check bits may be transmitted with each byte or with a group of bytes. Such check bits are generated automatically by the machine and cannot be directly controlled by the program. Storage capacities are expressed in numbers of bytes. When the length of a memory operand field is implied by the operation code of an instruction, the field is said to have a fixed length, which can be one, two, four, eight, or sixteen bytes. Larger fields may be implied for some instructions. When the length of a memory operand field is not implied but is stated explicitly, the field is said to have a variable length. Variable-length operands can vary in length by increments of one byte (or with some instructions, in multiples of two bytes or other multiples). When information is placed in memory, the contents of only those byte locations that are included in the designated field are replaced, even though the width of the physical path to memory may be greater than the length of the field being stored.
Certain units of information are to be on an integral boundary in memory. A boundary is called integral for a unit of information when its memory address is a multiple of the length of the unit in bytes. Special names are given to fields of 2, 4, 8, and 16 bytes on an integral boundary. A halfword is a group of two consecutive bytes on a two-byte boundary and is the basic building block of instructions. A word is a group of four consecutive bytes on a four-byte boundary. A doubleword is a group of eight consecutive bytes on an eight-byte boundary. A quadword is a group of 16 consecutive bytes on a 16-byte boundary. When memory addresses designate halfwords, words, doublewords, and quadwords, the binary representation of the address contains one, two, three, or four rightmost zero bits, respectively. Instructions are to be on two-byte integral boundaries. The memory operands of most instructions do not have boundary-alignment requirements.
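As a small illustration of integral boundaries (a sketch, not architecture-specific code): an address is on a boundary for a power-of-two unit size exactly when the required number of rightmost bits are zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* An address is on an integral boundary for a unit of 'len' bytes
 * (len a power of two) when it is a multiple of len, i.e. when its
 * binary representation ends in the required number of zero bits. */
static bool on_integral_boundary(uint64_t addr, uint64_t len)
{
    return (addr & (len - 1)) == 0;
}

/* on_integral_boundary(0x1008, 8) -> true  (doubleword boundary)
 * on_integral_boundary(0x1006, 4) -> false (not a word boundary)  */
```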
On devices that implement separate caches for instructions and data operands, significant delays may be experienced if a program stores in a cache line and an instruction is subsequently fetched from the cache line, regardless of whether the store alters the subsequently fetched instruction.
In one embodiment, the invention may be practiced by software (sometimes referred to as licensed internal code, firmware, microcode, millicode, picocode and the like, any of which would be consistent with the present invention). Referring to FIG. 11, software program code which embodies the present invention is typically accessed by a processor of the host system 5000 from long-term storage media devices 5011, such as a CD-ROM drive, tape drive or hard drive. The software program code may be embodied on any of a variety of known media for use with a data processing system, such as a diskette, hard drive, or CD-ROM. The code may be distributed on such media, or may be distributed to users from the computer memory 5002 or storage of one computer system over the network 5010 to users of other computer systems for use by users of such other systems.
The software program code includes an operating system which controls the function and interaction of the various computer components and one or more application programs. The program code is typically paged from the storage media device 5011 to the relatively higher speed computer memory 5002 where it is available to the processor 5001. The techniques and methods for embodying software program code in memory, on physical media, and/or distributing software code via networks are well known and will not be discussed further herein. When the program code is created and stored on a tangible medium, including but not limited to an electronic memory module (RAM), flash memory, Compact Discs (CDs), DVDs, tapes, etc., it is often referred to as a "computer program product". The computer program product medium is typically readable by processing circuitry preferably located in a computer system for execution by the processing circuitry.
FIG. 12 illustrates a representative workstation or server hardware system in which the present invention may be implemented. The system 5020 of fig. 12 includes a representative base computer system (base computer system) 5021, such as a personal computer, workstation or server, including optional peripherals. A basic computer system 5021 comprises one or more processors 5026 and a bus used to connect and enable communication between the processors 5026 and other components of the system 5021, in accordance with known techniques. The bus connects the processor 5026 to memory 5025 and long-term storage 5027 which may comprise a hard disk drive (including any of magnetic media, CD, DVD, and flash memory, for example) or a tape drive, for example. The system 5021 can also include a user interface adapter that connects the microprocessor 5026 via the bus to one or more interface devices, such as a keyboard 5024, a mouse 5023, a printer/scanner 5030, and/or other interface devices, which can be any user interface device, such as a touch-sensitive screen, a digital input pad (digitized entry pad), etc. The bus may also connect a display device 5022, such as an LCD screen or monitor, to the microprocessor 5026 via a display adapter.
The system 5021 may communicate with other computers or networks of computers via a network adapter capable of communicating 5028 with a network 5029. Exemplary network adapters are communications channels, token ring, Ethernet or modems. Alternatively, the system 5021 may communicate using a wireless interface, such as a CDPD (cellular digital packet data) card. The system 5021 can be associated with such other computers in a Local Area Network (LAN) or a Wide Area Network (WAN), or the system 5021 can be a client in a client/server arrangement with another computer, etc. All of these configurations, as well as suitable communication hardware and software, are known in the art.
Figure 13 illustrates a data processing network 5040 in which the present invention may be implemented. The data processing network 5040 may include a plurality of separate networks, such as wireless and wired networks, each of which may include a plurality of separate workstations 5041, 5042, 5043, 5044. Further, those skilled in the art will appreciate that one or more LANs may be included, wherein a LAN may include a plurality of intelligent workstations coupled to a host processor.
Still referring to FIG. 13, the network may also include mainframe computers or servers, such as a gateway computer (client server 5046) or application server (remote server 5048), which may access a data repository and may also be accessed directly from a workstation 5045. The gateway computer 5046 serves as a point of entry into each individual network. A gateway is needed when connecting one networking protocol to another. The gateway 5046 may be preferably coupled to another network (the Internet 5047, for example) by means of a communications link. The gateway 5046 may also be directly coupled to one or more workstations 5041, 5042, 5043, 5044 using a communications link. The gateway computer may be implemented utilizing an IBM eServer™ System z server available from International Business Machines Corporation.
Referring concurrently to fig. 12 and 13, software programming code which may embody the present invention may be accessed by the processor 5026 of the system 5020 from long-term storage media 5027, such as a CD-ROM drive or hard drive. The software programming code may be embodied on any of a variety of known media for use with a data processing system, such as a floppy disk, a hard drive, or a CD-ROM. The code may be distributed on such media, or from the memory or storage of one computer system over a network to users 5050, 5051 of other computer systems for use by users of such other systems.
Alternatively, the programming code may be embodied in the memory 5025 and accessed by the processor 5026 using a processor bus. Such programming code includes an operating system which controls the function and interaction of the various computer components and one or more application programs 5032. Program code is typically paged from the storage medium 5027 to high-speed memory 5025 where it is available for processing by the processor 5026. Techniques and methods for embodying software programming code in memory, on physical media, and/or distributing software code via networks are well known and will not be discussed further herein. Program code, when created and stored on tangible media, including but not limited to electronic memory modules (RAM), flash memory, Compact Discs (CDs), DVDs, tapes, etc., is commonly referred to as a "computer program product". The computer program product medium is typically readable by a processing circuit, preferably located in a computer system, for execution by the processing circuit.
The cache most readily used by the processor (which is typically faster and smaller than the other caches of the processor) is the lowest level (L1 or level 1) cache, and main storage (main memory) is the highest level cache (L3 if there are three levels). The lowest level cache is often divided into an instruction cache (I-cache) that holds the machine instructions to be executed, and a data cache (D-cache) that holds the data operands.
Referring to FIG. 14, an exemplary processor embodiment is shown for the processor 5026. Typically, one or more levels of cache 5053 are used to buffer memory blocks in order to improve processor performance. The cache 5053 is a cache buffer that holds cache lines of memory data that are likely to be used. Typical cache lines are 64, 128 or 256 bytes of memory data. A separate cache is typically used for caching instructions rather than data. Cache coherency (synchronization of copies of lines in memory and cache) is typically provided by various "snoop" algorithms well known in the art. The main memory 5025 of the processor system is commonly referred to as a cache. In a processor system having 4 levels of cache 5053, main memory 5025 is sometimes referred to as a level 5 (L5) cache, because it is typically faster and maintains only a portion of the non-volatile storage (DASD, tape, etc.) that is available to the computer system. Main memory 5025 may "cache" pages of data paged in and out of main memory 5025 by the operating system.
Program counter (instruction counter) 5061 keeps track of the address of the current instruction to be executed. The program counter in a z/Architecture processor is 64 bits and may be truncated to 31 or 24 bits to support prior addressing limits. The program counter is typically embodied in the PSW (program status word) of the computer so that it persists during context switching. Thus, a program in progress having a program counter value may be interrupted by, for example, the operating system (a context switch from the program environment to the operating-system environment). While the program is not active, the PSW of the program maintains the program counter value, and the program counter (in the PSW) of the operating system is used while the operating system executes. Typically, the program counter is incremented by an amount equal to the number of bytes of the current instruction. RISC (Reduced Instruction Set Computing) instructions are typically of fixed length, while CISC (Complex Instruction Set Computing) instructions are typically of variable length. Instructions of the z/Architecture are CISC instructions having a length of 2, 4 or 6 bytes. Program counter 5061 is modified by, for example, a context switch operation or a branch taken operation of a branch instruction. In a context switch operation, the current program counter value is saved in the program status word along with other state information about the program being executed (such as condition codes), and a new program counter value is loaded pointing to an instruction of a new program module to be executed. A branch taken operation is performed in order to permit the program to make decisions or loop within the program by loading the result of the branch instruction into program counter 5061.
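The sequential update of the program counter can be sketched as follows; the mapping from the leftmost opcode bits to an instruction length of 2, 4 or 6 bytes is shown only as an assumed illustration of a variable-length CISC encoding.

```c
#include <stdint.h>

/* Sequential program-counter update: the counter advances by the length of
 * the current instruction.  For the variable-length CISC encoding described
 * above, the length (2, 4 or 6 bytes) is assumed here to be derivable from
 * the first instruction halfword; the decode shown is illustrative only. */
static uint64_t next_pc(uint64_t pc, uint16_t first_halfword)
{
    unsigned top2 = first_halfword >> 14;            /* leftmost two opcode bits */
    unsigned len  = (top2 == 0) ? 2 : (top2 == 3) ? 6 : 4;
    return pc + len;
}
```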
Typically, instructions are fetched on behalf of the processor 5026 using an instruction fetch unit 5055. The fetch unit may fetch a "next sequence of instructions," a target instruction of a branch taken instruction, or a first instruction of a context-switched program. Present instruction fetch units typically use prefetch techniques to speculatively prefetch instructions based on the likelihood that the prefetched instructions will be used. For example, the fetch unit may fetch 16 bytes of instructions, including the next sequential instruction and additional bytes of further sequential instructions.
The fetched instructions are then executed by the processor 5026. In one embodiment, the fetched instructions are passed to the dispatch unit 5056 of the fetch unit. The dispatch unit decodes the instructions and forwards information about the decoded instructions to the appropriate units 5057, 5058, 5060. The execution unit 5057 will typically receive information from the instruction fetch unit 5055 regarding decoded arithmetic instructions, and will perform arithmetic operations on operands according to the opcode of the instruction. Operands are preferably provided to the execution unit 5057 from storage 5025, architectural registers 5059, or from an immediate field (immediate field) of the instruction being executed. The results of the execution, when stored, are stored in storage 5025, registers 5059, or other machine hardware (such as control registers, PSW registers, etc.).
The processor 5026 typically has one or more units 5057, 5058, 5060 for performing the function of instructions. Referring to fig. 15A, an execution unit 5057 may communicate with architected general registers 5059, decode/dispatch unit 5056, load store unit 5060, and other 5065 processor units via interface logic 5071. The execution unit 5057 may use several register circuits 5067, 5068, 5069 to hold information that the Arithmetic Logic Unit (ALU) 5066 is to operate on. The ALU performs arithmetic operations such as add, subtract, multiply, divide, and logical operations such as AND, OR, and exclusive OR (XOR), rotate, and shift. Preferably, the ALU supports specialized operations that are design dependent. Other circuitry may provide other architectural tools 5072, including condition codes and recovery support logic, for example. Typically, the results of the ALU operations are held in output register circuitry 5070, which may forward the results to a variety of other processing functions. There are many processor unit arrangements and this description is intended only to provide a representative understanding of one embodiment.
For example, an ADD instruction would be executed in an execution unit 5057 having arithmetic and logical functionality, while a floating point instruction, for example, would be executed in a floating point execution unit having specialized floating point capability. Preferably, an execution unit operates on operands identified by an instruction by performing an opcode-defined function on the operands. For example, an ADD instruction may be executed by the execution unit 5057 on operands found in two registers 5059 identified by register fields of the instruction.
The execution unit 5057 performs the arithmetic addition on two operands and stores the result in a third operand, where the third operand may be a third register or one of the two source registers. The execution unit preferably utilizes an Arithmetic Logic Unit (ALU) 5066 that is capable of performing a variety of logical functions such as shift, rotate, AND, OR, and XOR, as well as a variety of algebraic functions including add, subtract, multiply, and divide. Some ALUs 5066 are designed for scalar operations and some for floating point. Data may be big endian (where the least significant byte is at the highest byte address) or little endian (where the least significant byte is at the lowest byte address), depending on the architecture. The z/Architecture is big endian. Signed fields may be sign and magnitude, 1's complement, or 2's complement, depending on the architecture. A 2's complement number is advantageous in that the ALU does not need to design a subtract capability, since either a negative value or a positive value in 2's complement requires only addition within the ALU. Numbers are commonly described in shorthand, where a 12-bit field defines the address of a 4,096 byte block and is commonly described as a 4 Kbyte (kilobyte) block, for example.
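For example, the two's complement property mentioned above means subtraction can be performed with the adder alone, as in this small sketch:

```c
#include <stdint.h>

/* Subtraction with an adder only: negate the subtrahend by inverting its bits
 * and adding one (two's complement), then add.  This is why an ALU that uses
 * two's complement numbers needs no separate subtract circuit. */
static uint32_t alu_sub(uint32_t a, uint32_t b)
{
    return a + (~b + 1u);       /* a - b */
}
/* Example: alu_sub(7, 5) == 2; alu_sub(5, 7) == 0xFFFFFFFE (i.e. -2). */
```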
Referring to FIG. 15B, branch instruction information for executing a branch instruction is typically sent to a branch unit 5058, which often employs a branch prediction algorithm, such as a branch history table 5082, to predict the outcome of the branch before other conditional operations are complete. The target of the current branch instruction will be fetched and speculatively executed before the conditional operations are complete. When the conditional operations are completed, the speculatively executed branch instructions are either completed or discarded based on the conditions of the conditional operation and the speculated outcome. A typical branch instruction may test condition codes and branch to a target address if the condition codes meet the branch requirement of the branch instruction; the target address may be calculated based on several numbers, including ones found in register fields or an immediate field of the instruction, for example. The branch unit 5058 may employ an ALU 5074 having a plurality of input register circuits 5075, 5076, 5077 and an output register circuit 5080. The branch unit 5058 may communicate with, for example, general registers 5059, decode dispatch unit 5056, or other circuits 5073.
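A branch history table of the kind mentioned above can be sketched as an array of two-bit saturating counters; this is only an illustrative model, not the predictor of any particular processor.

```c
#include <stdbool.h>
#include <stdint.h>

/* A minimal branch-history-table sketch: one two-bit saturating counter per
 * slot, indexed by low-order bits of the branch address.  Real predictors
 * are far more elaborate; this only illustrates the idea. */
#define BHT_SIZE 1024
static uint8_t bht[BHT_SIZE];            /* counters 0..3; >= 2 means "taken" */

static bool predict_taken(uint64_t branch_addr)
{
    return bht[(branch_addr >> 1) % BHT_SIZE] >= 2;
}

static void update_bht(uint64_t branch_addr, bool taken)
{
    uint8_t *c = &bht[(branch_addr >> 1) % BHT_SIZE];
    if (taken  && *c < 3) (*c)++;        /* strengthen toward "taken"     */
    if (!taken && *c > 0) (*c)--;        /* strengthen toward "not taken" */
}
```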
The execution of a group of instructions may be interrupted for a variety of reasons including, for example, a context switch initiated by an operating system, a program exception or error causing a context switch, an I/O interruption signal causing a context switch, or multi-threading activity of a plurality of programs (in a multi-threaded environment). Preferably, a context switch action saves state information about the currently executing program and then loads state information about another program being invoked. The state information may be saved in hardware registers or in memory, for example. The state information preferably comprises a program counter value pointing to the next instruction to be executed, condition codes, memory translation information, and architected register contents. The context switch activities may be implemented by hardware circuits, application programs, operating system programs, or firmware code (microcode, picocode, or Licensed Internal Code (LIC)), alone or in combination.
The processor accesses operands according to instruction-defined methods. The instruction may provide an immediate operand using the value of part of the instruction, and may provide one or more register fields explicitly pointing to either general purpose registers or special purpose registers (floating point registers, for example). The instruction may utilize implied registers identified by an opcode field as operands. The instruction may utilize memory locations for operands. A memory location of an operand may be provided by a register, an immediate field, or a combination of registers and an immediate field, as exemplified by the z/Architecture long displacement facility, wherein the instruction defines a base register, an index register, and an immediate field (displacement field) that are added together to provide the address of the operand in memory, for example. Location herein typically implies a location in main memory (main storage) unless otherwise indicated.
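The base-index-displacement address generation described above can be sketched as follows (treating register 0 as contributing zero is an assumption of the sketch):

```c
#include <stdint.h>

/* Base-index-displacement address generation: the contents of a base register
 * and an index register are added to the immediate displacement to form the
 * operand's memory address.  Register number 0 contributes zero here, which
 * mirrors a common convention but is only assumed for this illustration. */
static uint64_t operand_address(const uint64_t gpr[16], unsigned b, unsigned x, int64_t disp)
{
    uint64_t base  = b ? gpr[b] : 0;
    uint64_t index = x ? gpr[x] : 0;
    return base + index + (uint64_t)disp;
}
```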
Referring to FIG. 15C, a processor accesses memory using a load/store unit 5060. The load/store unit 5060 may perform a load operation by obtaining the address of the target operand in memory 5053 and loading the operand into a register 5059 or another memory 5053 location, or may perform a store operation by obtaining the address of the target operand in memory 5053 and storing data obtained from a register 5059 or another memory 5053 location in the target operand location in memory 5053. The load/store unit 5060 may be speculative and may access memory in a sequence that is out-of-order relative to the instruction sequence; however, the load/store unit 5060 maintains the appearance to programs that instructions were executed in order. The load/store unit 5060 may communicate with general registers 5059, the decode/dispatch unit 5056, the cache/memory interface 5053, or other elements 5083, and comprises various register circuits, ALUs 5085, and control logic 5090 to calculate memory addresses and to provide pipeline sequencing to keep operations in order. Some operations may be out of order, but the load/store unit provides functionality to make the out-of-order operations appear to the program as having been performed in order, as is well known in the art.
Preferably, the addresses that an application program "sees" are often referred to as virtual addresses. Virtual addresses are sometimes referred to as "logical addresses" and "effective addresses." These virtual addresses are virtual in that they are redirected to a physical memory location by one of a variety of Dynamic Address Translation (DAT) technologies, including, but not limited to, simply prefixing a virtual address with an offset value, or translating the virtual address via one or more translation tables, the translation tables preferably comprising at least a segment table and a page table (alone or in combination), preferably with the segment table having an entry pointing to the page table. In the z/Architecture, a hierarchy of translation is provided, including a region first table, a region second table, a region third table, a segment table, and an optional page table. The performance of the address translation is typically improved by utilizing a Translation Lookaside Buffer (TLB), which comprises entries mapping a virtual address to an associated physical memory location. The entries are created when the DAT translates a virtual address using the translation tables. Subsequent use of the virtual address can then utilize the entry of the fast TLB rather than the slow sequential translation table accesses. TLB content may be managed by a variety of replacement algorithms, including LRU (Least Recently Used).
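A simplified sketch of a direct-mapped TLB in front of a table walk illustrates the mechanism; the page size, table organization, and the walk_translation_tables helper are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 256
#define PAGE_SHIFT  12                      /* 4 KB pages assumed for the sketch */

struct tlb_entry { uint64_t vpage; uint64_t ppage; bool valid; };
static struct tlb_entry tlb[TLB_ENTRIES];

/* Assumed helper: walks the translation tables for a virtual page number. */
extern uint64_t walk_translation_tables(uint64_t vpage);

/* Translate a virtual address: try the TLB first; on a miss, walk the
 * translation tables and cache the result so later accesses avoid the walk. */
static uint64_t translate(uint64_t vaddr)
{
    uint64_t vpage = vaddr >> PAGE_SHIFT;
    struct tlb_entry *e = &tlb[vpage % TLB_ENTRIES];
    if (!e->valid || e->vpage != vpage) {   /* TLB miss: slow path */
        e->vpage = vpage;
        e->ppage = walk_translation_tables(vpage);
        e->valid = true;
    }
    return (e->ppage << PAGE_SHIFT) | (vaddr & ((1u << PAGE_SHIFT) - 1));
}
```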
In the case where the processor is a processor of a multi-processor system, each processor has the responsibility of keeping shared resources, such as I/O, caches, TLBs, and memory, interlocked for coherency. Typically, "snoop" technologies will be utilized in maintaining cache coherency. In a snoop environment, each cache line may be marked as being in any one of a shared state, an exclusive state, a changed state, an invalid state, and the like, in order to facilitate sharing.
The I/O unit 5054 (FIG. 14) provides the processor with means for attaching to peripheral devices, including tape, disks, printers, displays, and networks, for example. I/O units are typically presented to the computer program by software drivers. In mainframes such as the System z from IBM, channel adapters and open system adapters are I/O units of the mainframe that provide communication between the operating system and peripheral devices.
Moreover, other types of computing environments may benefit from one or more aspects of the present invention. By way of example, an environment may include an emulator (e.g., software or other emulation mechanisms), in which a particular architecture (including, for example, instruction execution, architectural functions such as address translation, and architectural registers) or a subset thereof is emulated (e.g., in a native computer system having a processor and memory). In such an environment, one or more emulation functions of the emulator can implement one or more aspects of the present invention, even though the computer executing the emulator may have a different architecture than the capabilities being emulated. As one example, in emulation mode, a particular instruction or operation being emulated is decoded, and the appropriate emulation function is established to implement the single instruction or operation.
In an emulation environment, a host computer includes, for example, memory to store instructions and data; an instruction fetch unit to fetch instructions from memory and, optionally, to provide local buffering of fetched instructions; an instruction decode unit to receive the fetched instruction and determine a type of instruction that has been fetched; and an instruction execution unit to execute the instruction. Execution may include loading data from memory to a register; storing data from the register back to the memory; or perform some type of arithmetic or logical operation as determined by the decode unit. In one example, each unit is implemented in software. For example, the operations performed by the units are implemented as one or more subroutines in emulator software.
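In that shape, an emulator's main loop can be sketched as a fetch/decode/dispatch cycle; the one-byte opcode and the dispatch table below are placeholders rather than any real emulated instruction set.

```c
#include <stdint.h>

/* A minimal emulator loop in the shape described above: fetch a guest
 * instruction, decode its type, and dispatch to an execution subroutine.
 * The opcode format and handler table are illustrative assumptions. */
typedef void (*handler_fn)(uint64_t *regs, uint8_t *mem, uint64_t *pc);
extern handler_fn dispatch_table[256];       /* one subroutine per opcode (assumed) */

static void emulate(uint64_t *regs, uint8_t *mem, uint64_t *pc)
{
    for (;;) {
        uint8_t opcode = mem[*pc];             /* instruction fetch            */
        handler_fn h = dispatch_table[opcode]; /* instruction decode           */
        h(regs, mem, pc);                      /* instruction execution; the   */
                                               /* handler advances *pc itself  */
    }
}
```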
More specifically, in a mainframe, architected machine instructions are used by programmers (usually today's "C" programmers), often by way of a compiler application. These instructions stored in the storage medium may be executed natively in a z/Architecture IBM server, or alternatively in machines executing other architectures. They can be emulated in existing and in future IBM mainframe servers and on other machines of IBM (e.g., Power Systems servers and System x servers). They can be executed in machines running Linux on a wide variety of machines using hardware manufactured by IBM, AMD™, and other manufacturers. Besides execution on that hardware under a z/Architecture, Linux can be used as well on machines which use emulation provided by Hercules or FSI (Fundamental Software, Inc.), where generally execution is in an emulation mode. In emulation mode, emulation software is executed by a native processor to emulate the architecture of an emulated processor. Information relating to the emulator products referenced above is available on the World Wide Web at www.hercules-390.org and www.funsoft.com, respectively.
The native processor typically executes emulation software, comprising either firmware or a native operating system, to perform emulation of the emulated processor. The emulation software is responsible for fetching and executing instructions of the emulated processor architecture. The emulation software maintains an emulated program counter to keep track of instruction boundaries. The emulation software may fetch one or more emulated machine instructions at a time and convert the one or more emulated machine instructions to a corresponding group of native machine instructions for execution by the native processor. These converted instructions may be cached such that a faster conversion can be accomplished. Notwithstanding, the emulation software maintains the architecture rules of the emulated processor architecture so as to assure that operating systems and applications written for the emulated processor operate correctly. Furthermore, the emulation software provides resources identified by the emulated processor architecture, including, but not limited to, control registers, general purpose registers, floating point registers, dynamic address translation functions including, for example, segment tables and page tables, interrupt mechanisms, context switch mechanisms, Time of Day (TOD) clocks, and architected interfaces to the I/O subsystem, such that an operating system or an application program designed to run on the emulated processor can be run on the native processor having the emulation software.
The particular instruction being emulated is decoded, and a subroutine is called to perform the function of that individual instruction. An emulation software function emulating a function of an emulated processor is implemented, for example, in a "C" subroutine or driver, or by some other method of providing a driver for the specific hardware, as will be within the skill of those in the art after understanding the description of the preferred embodiments. Various software and hardware emulation patents, including, but not limited to, U.S. patent No. 5,551,013, entitled "Multiprocessor for Hardware Emulation," by Beausoleil et al.; U.S. patent No. 6,009,261, entitled "Preprocessing of Stored Target Routines for Emulating Incompatible Instructions on a Target Processor," by Scalzi et al.; U.S. patent No. 5,574,873, entitled "Decoding Guest Instruction to Directly Access Emulation Routines that Emulate the Guest Instructions," by Davidian et al.; U.S. patent No. 6,308,255, entitled "Symmetrical Multiprocessing Bus and Chipset Used for Coprocessor Support Allowing Non-Native Code to Run in a System," by Gorishek et al.; U.S. patent No. 6,463,582, entitled "Dynamic Optimizing Object Code Translator for Architecture Emulation and Dynamic Optimizing Object Code Translation Method," by Lethin et al.; and U.S. patent No. 5,790,825, entitled "Method for Emulating Guest Instructions on a Host Computer Through Dynamic Recompilation of Host Instructions," by Eric Traut, as well as numerous other patents, illustrate a variety of known ways to achieve emulation of an instruction format architected for a different machine for a target machine available to those skilled in the art.
In FIG. 16, an example of an emulated host computer system 5092 is provided that emulates a host computer system 5000' of a host architecture. In the emulated host computer system 5092, the host processor (CPU) 5091 is an emulated host processor (or virtual host processor) and comprises an emulation processor 5093 having a different native instruction set architecture than that of the processor 5091 of the host computer 5000'. The emulated host computer system 5092 has memory 5094 accessible to the emulation processor 5093. In the example embodiment, the memory 5094 is partitioned into a host computer memory 5096 portion and an emulation routines 5097 portion. The host computer memory 5096 is available to programs of the emulated host computer 5092 according to the host computer architecture. The emulation processor 5093 executes native instructions of an architected instruction set of an architecture other than that of the emulated processor 5091 (i.e., native instructions obtained from emulation routines memory 5097), and may access a host instruction for execution from a program in host computer memory 5096 by employing one or more instructions obtained in a sequence and access/decode routine which may decode the host instructions accessed to determine a native instruction execution routine for emulating the function of the host instruction accessed. Other facilities that are defined for the host computer system 5000' architecture may be emulated by architected facilities routines, including such facilities as general purpose registers, control registers, dynamic address translation, and I/O subsystem support and processor cache, for example. The emulation routines may also take advantage of functions available in the emulation processor 5093 (such as general registers and dynamic translation of virtual addresses) to improve the performance of the emulation routines. Specialized hardware and offload engines may also be provided to assist the processor 5093 in emulating the functionality of the host computer 5000'.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or components.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (14)

1. A method of executing a load instruction for loading data from an adapter, comprising the steps of:
obtaining a machine instruction for execution, the machine instruction defined for computer execution according to a computer architecture, the machine instruction comprising:
an opcode field identifying a load from adapter instruction;
a first field identifying a first location into which data retrieved from the adapter is to be loaded;
a second field identifying a second location, the contents of the second field including a function handle identifying the adapter, a designation of an address space within the adapter from which data is loaded, and an offset within the address space; and
executing the machine instruction, the executing comprising:
obtaining a function table entry associated with the adapter using the function handle;
obtaining a data address of the adapter using at least one of information in a function table entry and the offset; and
retrieving data from a particular location in the address space identified by the designation of the address space, wherein the particular location is identified by the data address of the adapter.
2. The method of claim 1, wherein the address space to be accessed is one of a memory space or an I/O space, and wherein obtaining the data address comprises: the data address is obtained using one or more parameters of a function table entry.
3. The method of claim 2, wherein the obtaining the data address using one or more parameters of a function table entry comprises: adding a value of a base address register of the function table entry to the offset to obtain the data address.
4. The method of claim 1, wherein the address space to be accessed is a configuration space, and wherein the data address is an offset provided by an instruction, the offset identifying a register number in the configuration space.
5. The method of claim 1, wherein the executing further comprises placing the data in the first location specified by the instruction.
6. The method of claim 5, wherein the content of the second location comprises an amount of data to be acquired.
7. The method of claim 1, wherein the instructions are executed based on an adapter architecture.
8. A computer system that executes a load instruction for loading data from an adapter, the computer system comprising:
a machine instruction obtaining component for obtaining a machine instruction for execution, the machine instruction defined for computer execution according to a computer architecture, the machine instruction comprising:
an opcode field identifying a load from adapter instruction;
a first field identifying a first location into which data retrieved from the adapter is to be loaded;
a second field identifying a second location, the contents of the second field including a function handle identifying the adapter, a designation of an address space within the adapter from which data is loaded, and an offset within the address space; and
an execution component for executing the machine instruction, comprising:
a function table entry obtainer for obtaining a function table entry associated with the adapter using the function handle;
a data address obtainer for obtaining a data address of the adapter using at least one of the information in the function table entry and the offset; and
a data retrieval component for retrieving data from a particular location in the address space identified by the designation of the address space, wherein the particular location is identified by the data address of the adapter.
9. The computer system of claim 8, wherein the address space to be accessed is one of a memory space or an I/O space, and wherein obtaining the data address comprises: the data address is obtained using one or more parameters of a function table entry.
10. The computer system of claim 9, wherein the data address obtainer for obtaining the data address of the adapter using at least one of the offset and information in the function table entry comprises: means for adding the value of the base address register of the function table entry to the offset to obtain the data address.
11. The computer system of claim 8, wherein the address space to be accessed is a configuration space, and wherein the data address is an offset provided by an instruction, the offset identifying a register number in the configuration space.
12. The computer system of claim 8, wherein the execution component for executing the machine instruction further comprises means for placing the data in the first location specified by the instruction.
13. The computer system of claim 12, wherein the content of the second location comprises an amount of data to be retrieved.
14. The computer system of claim 8, wherein the instructions are executed based on an adapter architecture.
HK13108100.5A 2010-06-23 2010-11-08 Method and system for load instruction for communicating with adapters HK1180802B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US12/821,182 US8566480B2 (en) 2010-06-23 2010-06-23 Load instruction for communicating with adapters
US12/821,182 2010-06-23
PCT/EP2010/067035 WO2011160716A1 (en) 2010-06-23 2010-11-08 Load instruction for communicating with adapters

Publications (2)

Publication Number Publication Date
HK1180802A1 HK1180802A1 (en) 2013-10-25
HK1180802B true HK1180802B (en) 2015-12-18


Similar Documents

Publication Publication Date Title
EP2430536B1 (en) Controlling access by a configuration to an adapter function
EP2430523B1 (en) Load instruction for communicating with adapters
EP2430556B1 (en) Enable/disable adapters of a computing environment
DK2430524T3 (en) Storage / STORAGE BLOCK INSTRUCTIONS FOR COMMUNICATION WITH ADAPTERS
EP2430535A1 (en) Guest access to address spaces of adapter
HK1180802B (en) Method and system for load instruction for communicating with adapters
HK1180799B (en) Method and system for executing a store instruction for storing data in an adapter
HK1180803B (en) Method and system for controlling access to adapters of a computing environment
HK1180796B (en) Converting a message signaled interruption into an i/o adapter event notification
HK1180795B (en) Method for facilitating management of system memory of a computing environment
HK1180801A (en) Method for executing an instruction for selectively modifying adapter function parameters; computer system and computer program product for the same
HK1180796A1 (en) Converting a message signaled interruption into an i/o adapter event notification
HK1180793B (en) Translation of input/output addresses to memory addresses
HK1180795A1 (en) Method for facilitating management of system memory of a computing environment
HK1180793A1 (en) Input/output address to memory address conversion
HK1180794B (en) Resizing address spaces concurrent to accessing the address spaces
HK1180794A1 (en) Resizing address spaces concurrent to accessing the address spaces