SCSIport is the Microsoft® Windows® system-supplied storage port driver, designed to manage SCSI transport on parallel SCSI interconnects. During the StartIo routine, the SCSIport driver translates the I/O request packet into a SCSI request block (SRB) and queues it to the miniport driver, which decodes the SRB and translates it into the specific format required by the storage controller. The start I/O phase of the request, which includes build and start (see Figure 2), takes microseconds; hundreds of SRBs can be queued, ready to be processed before even a single I/O request is completed. In fact, the longest phase of the I/O request is the data seek time (latency); if the data is not in cache, finding the correct blocks on the physical disk can take several milliseconds. (Note that the diagram shows relative time, not actual time units.) Once the hardware processes the I/O request—that is, does the data transfer—the controller generates a hardware interrupt indicating that the I/O has been completed. The interrupt is, in turn, processed by the HwInterrupt miniport routine (indicated as ISR or Interrupt Service Routine in the diagrams), which receives the completed requests and begins the whole process again. Data transfers are performed by the hardware itself (using Direct Memory Access or DMA) without operating system intervention.
Figure 2. Phases of an I/O Request (not to scale, relative durations shown)
While the SCSIport driver is an effective solution for storage via parallel SCSI, it was not designed for either Fibre Channel or hardware RAID, and when used with these adapters, the full capabilities of the high performance interconnect cannot be realized. The nature of the performance limitations and their causes are detailed in the following sections.
Adapter I/O Limit
SCSIport can support a maximum of 254 outstanding I/O requests per adapter. Since each adapter may provide support for multiple SCSI devices, all the devices sharing that adapter must share the maximum of 254 outstanding requests. With a parallel SCSI bus that is designed to support a maximum of 15 attached devices, this may not be a problem; typically each SCSI physical disk corresponds to a single logical unit.
In Fibre Channel environments, however, the number of target devices each adapter is designed to support is much higher. Fibre Channel arbitrated loop configurations can support 126 devices (hosts and disks). Switch configurations can theoretically support up to 16 million devices. Even without this level of device complexity, using the SCSIport driver, with its limit of 254 outstanding I/O requests per adapter, can be a significant bottleneck in Fibre Channel environments because each disk device commonly maps to multiple logical units (potentially thousands).
At any given time, the SCSIport driver supports either the issuing or the completion of I/O requests, but not simultaneous execution of both request functions. In other words, once an I/O request enters the StartIo routine and the SRB is sent to a host bus adapter (HBA), this transmission mode (sometimes rather misleadingly called “half duplex”) prevents the adapter from processing storage device interrupts until the start I/O processing phase is complete.
Conversely, once the miniport interrupt processing begins, new I/O requests packets (IRPs) are blocked from being issued (although those I/O requests already being processed can be completed). Only after the interrupt has been received and completely processed by the miniport/port driver, can new I/O requests be started.
Figure 3 demonstrates how the SCSIport driver handles multiple I/O requests and interrupts. I/O request processing is sequential—the second IRP cannot be started until the first reaches the end of the start I/O routine and is sent off to the hardware. Likewise, interrupt service requests arriving after IRPs have entered the start I/O routine are queued (in the order in which they are received) and must wait until the in-progress IRPs are ready to enter the next stage of processing, such as queuing or handing off to the miniport.
Figure 3. SCSIport: Sequential I/O Functioning
In single processor systems, the SCSIport requirement that the start I/O routine be synchronized with the interrupt service routine—so that only one of these routines can execute at any one time—has negligible impact. In multiprocessor systems, however, the impact is considerable. Although up to 64 processors may be available for I/O processing, SCSIport cannot exploit the parallel processing capabilities. The net result is considerably more I/O processing time than would be required if start I/Os and interrupts could be executed simultaneously rather than sequentially.
Increased Miniport Load at Elevated IRQLs
The SCSIport driver was designed to do the majority of the work necessary to translate each I/O request packet into corresponding SCSI request blocks (SRBs); the miniport driver only does minimal additional processing during its HWStartIo routine. Its primary responsibility is to acknowledge receipt of the built SRB, construct the list (scatter/gather list) of memory addresses in a format the hardware can use, and transmit the SRB and scatter/gather list to the to the hardware. However, when some non-SCSI adapters are used with SCSIport, the SRB may require additional translation to non-SCSI protocols. These additional steps must be performed by the miniport driver during its HwStartIO routine.
The HwStartIo routine always executes with the interrupt request level (IRQL) of the processor at the same priority level as the interrupt request of the device. Because all interrupts with the same or lower priority are masked to enable a higher priority process to complete without interruption, the elevated IRQL means that hardware interrupts accumulate rather than being processed. With parallel SCSI adapters, this has minimal impact, since there is very little additional work for the miniport driver to do. However, when using Fibre Channel or hardware RAID adapters, the workload on the miniport driver is much heavier; as a consequence, considerably more time is spent at an elevated IRQL. The net result of high numbers of accumulated interrupts is degraded system performance.
Data Buffer Processing Overhead
To correctly process an I/O request, the miniport driver must pass physical addresses that correspond to the IRP’s data buffer (called a scatter/gather list) to the adapter. The current architecture of the SCSIport driver requires that the miniport driver repeatedly call the port driver to access this information, one element at a time, rather than obtain a complete list all at once. This repeated calling is CPU and time-intensive, and becomes inefficient with large data transfers or when memory becomes fragmented.
SCSIport maintains two types of I/O request queues, a device queue for each SCSI device on a controller, and an adapter queue for each SCSI controller. The SCSIport queue does not provide an explicit method for miniport drivers to control the way items are queued to their devices. This is problematic for complex Fibre Channel configurations that must be able to pause and resume queues correctly in the event that a link connection goes down.
Device queue. In a multiprocessor system, a mechanism is necessary to synchronize access to each storage device so that different processor requests (such as writing to a file or updating a database) are prevented from simultaneous access to the adapter. One mechanism of doing this is to use a spinlock. Only the processor in possession of the spinlock (a synchronization function) can make changes to the hardware; the other processors in the device queue are held in a wait mode, “spinning” until the next processor in the queue can acquire the spinlock and proceed with its task.
Adapter Queue. The I/O requests passed down through the driver layers encounter the spinlock just above the miniport layer. From there, the requests are passed to each adapter, as shown in Figure 4.
Figure 4. Successively More Restrictive Queuing in the SCSIport Driver Model
The drawbacks to the SCSIport queuing process are several. First, each I/O request must queue for access to a spinlock not just once, but twice. Second, the adapter queue restricts I/O throughput to a maximum of 254 requests per adapter, on a first in-first out (FIFO) basis. For high performance adapters, which can process thousands of requests at a time, this can be a serious performance limitation. And third, SCSIport does not provide any means by which to manage device queues to improve performance under conditions of high load or to temporarily suspend I/O processing without accumulating errors. A consequence of this is that a busy device monopolizes the adapter queue while other devices might be able to respond without delay.
Figure 5 presents a baseline of SCSIport performance. Note that the time “units” presented in this figure are for illustrative purposes only and not intended to indicate actual times. Since I/O functioning is sequential in nature, I/O requests in process must complete without interruption. Interrupt service requests are queued and not processed until a suitable break in I/O processing. The benefits of multiprocessing are not realized.
Figure 5. SCSIport I/O Performance
This baseline provides the point of comparison with Storport functioning and performance, as discussed in the Storport section.