• Receive-Side Scaling (RSS)
  • Message-Signaled Interrupts (MSI/MSI-X)
  • Network Adapter Resources
  • Suggested Network Adapter Features for Server Roles
  • Tuning the Network Adapter
  • Enabling Offload Features
  • Increasing Network Adapter Resources
  • Enabling Interrupt Moderation
  • Enabling RSS for Web Scenarios
  • Binding Each Adapter to a CPU
  • Performance Tuning Guidelines for Windows Server 2008 R2 October 15, 2010





    Choosing a Network Adapter


    Network-intensive applications require high-performance network adapters. This section covers some considerations for choosing network adapters.

    Offload Capabilities


    Offloading tasks can reduce CPU usage on the server, which improves overall system performance. The Microsoft network stack can offload one or more tasks to a network adapter if you choose one that has the appropriate offload capabilities. Table 5 provides more details about each offload capability.

    Table 5. Offload Capabilities for Network Adapters



    Offload type | Description

    Checksum calculation | The network stack can offload the calculation and validation of both Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) checksums on sends and receives. It can also offload the calculation and validation of both IPv4 and IPv6 checksums on sends and receives.

    IP security authentication and encryption | The TCP/IP transport can offload the calculation and validation of encrypted checksums for authentication headers and Encapsulating Security Payloads (ESPs). The TCP/IP transport can also offload the encryption and decryption of ESPs.

    Segmentation of large TCP packets | The TCP/IP transport supports Large Send Offload v2 (LSOv2). With LSOv2, the TCP/IP transport can offload the segmentation of large TCP packets to the hardware.

    TCP stack | The TCP offload engine (TOE) enables a network adapter that has the appropriate capabilities to offload the entire network stack.
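
    To give a sense of the per-packet work that checksum offload moves from the host CPU to the adapter, the following minimal sketch computes the 16-bit one's-complement Internet checksum described in RFC 1071, which IPv4, TCP, and UDP use. The function name and the sample header bytes are illustrative assumptions and are not part of the Windows network stack.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF


# Sample 20-byte IPv4 header (illustrative values) with the checksum field zeroed.
HEADER_WITHOUT_CHECKSUM = bytes.fromhex(
    "45000073" "00004000" "4011" "0000" "c0a80001" "c0a800c7"
)

checksum = internet_checksum(HEADER_WITHOUT_CHECKSUM)
print(hex(checksum))  # 0xb861

# Validation on receive: with the checksum filled in, the result is zero.
filled = (
    HEADER_WITHOUT_CHECKSUM[:10]
    + checksum.to_bytes(2, "big")
    + HEADER_WITHOUT_CHECKSUM[12:]
)
print(internet_checksum(filled))  # 0
```

    Without offload, the host runs a loop of this kind over every packet it sends and receives. With offload, the adapter performs the calculation and validation in hardware.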

    Receive-Side Scaling (RSS)


    Windows Server 2008 R2 supports Receive-Side Scaling (RSS) out of the box, as does Windows Server 2008. RSS distributes incoming network I/O packets among processors so that packets that belong to the same TCP connection are processed on the same processor, which preserves ordering. This improves scalability and performance for receive-intensive scenarios that have fewer network adapters than available processors. Research shows that distributing packets to logical processors that share the same physical processor (for example, when hyper-threading is enabled) degrades performance. Therefore, packets are distributed only across physical processors. Windows Server 2008 R2 offers the following optimizations for improved scalability with RSS:

    • NUMA awareness.

    RSS considers NUMA node distance (latency between nodes) when selecting processors for load balancing incoming packets.

    • Improved initialization and processor selection algorithm.

    At boot time, the Windows Server 2008 R2 networking stack considers the bandwidth and media connection state when assigning CPUs to RSS-capable adapters. Higher-bandwidth adapters get more CPUs at startup. Multiple NICs with the same bandwidth receive the same number of RSS CPUs.

    • More control over RSS on a per-NIC basis.

    Depending on the scenario and the workload characteristics, you can use the following registry parameters to choose on a per-NIC basis how many processors can be used for RSS, the starting offset for the range of processors, and which node the NIC allocates memory from:

    *MaxRSSProcessors

    HKLM\system\CurrentControlSet\Control\class\{XXXXX72-XXX}\\(REG_SZ)


    The maximum number of RSS processors assigned to each NIC.

    *RssBaseProcNumber

    HKLM\system\CurrentControlSet\Control\class\{XXXXX72-XXX}\\(REG_SZ)


    The first processor in the range of RSS processors assigned to each NIC.

    *NumaNodeID

    HKLM\system\CurrentControlSet\Control\class\{XXXXX72-XXX}\\(REG_SZ)


    The NUMA node each NIC can allocate memory from.

    Note: The asterisk (*) is part of the registry parameter.
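
    The following sketch models how the three per-NIC parameters above fit together. It assumes that the RSS processors form a contiguous range that starts at *RssBaseProcNumber and holds at most *MaxRSSProcessors entries. That reading, the class name, and the sample values are assumptions for illustration. The processors that Windows actually selects also depend on factors such as NUMA distance and hyper-threading.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RssNicSettings:
    """Hypothetical container for the per-NIC RSS registry parameters."""
    max_rss_processors: int              # *MaxRSSProcessors
    rss_base_proc_number: int            # *RssBaseProcNumber
    numa_node_id: Optional[int] = None   # *NumaNodeID

    def processor_range(self, total_processors: int) -> List[int]:
        """Assumed model: a contiguous range clipped to the processors present."""
        end = min(self.rss_base_proc_number + self.max_rss_processors,
                  total_processors)
        return list(range(self.rss_base_proc_number, end))

    def registry_values(self) -> Dict[str, str]:
        """REG_SZ values are strings; the asterisk is part of each name."""
        values = {
            "*MaxRSSProcessors": str(self.max_rss_processors),
            "*RssBaseProcNumber": str(self.rss_base_proc_number),
        }
        if self.numa_node_id is not None:
            values["*NumaNodeID"] = str(self.numa_node_id)
        return values


# Two NICs on a hypothetical 8-processor server, given disjoint ranges.
nic_a = RssNicSettings(max_rss_processors=4, rss_base_proc_number=0, numa_node_id=0)
nic_b = RssNicSettings(max_rss_processors=4, rss_base_proc_number=4, numa_node_id=1)

print(nic_a.processor_range(8))  # [0, 1, 2, 3]
print(nic_b.processor_range(8))  # [4, 5, 6, 7]
print(nic_b.registry_values())
```

    Giving each NIC a disjoint range is one way to keep the receive processing of two busy adapters from competing for the same processors.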

    For more information about RSS, see the document about Scalable Networking in "Resources" later in this guide.

    Message-Signaled Interrupts (MSI/MSI-X)


    Network adapters that support MSI/MSI-X can target their interrupts to specific processors. If the adapters also support RSS, then a processor can be dedicated to servicing interrupts and DPCs for a given TCP connection. This preserves the cache locality of TCP structures and greatly improves performance.

    Network Adapter Resources


    A few network adapters actively manage their resources to achieve optimum performance. Several network adapters let the administrator manually configure resources by using the Advanced Networking tab for the adapter. For such adapters, you can set the values of a number of parameters including the number of receive buffers and send buffers.

    Interrupt Moderation


    To control interrupt moderation, some network adapters expose different interrupt moderation levels, buffer coalescing parameters (sometimes separately for send and receive buffers), or both. You should consider buffer coalescing or batching when the network adapter does not perform interrupt moderation. Interrupt moderation helps reduce overall CPU utilization by minimizing the per-buffer processing cost, but the moderation of interrupts and buffer batching can have a negative impact on latency-sensitive scenarios.

    Suggested Network Adapter Features for Server Roles


    Table 6 lists high-performance network adapter features that can improve performance in terms of throughput, latency, or scalability for some server roles.

    Table 6. Benefits from Network Adapter Features for Different Server Roles



    Server role                           | Checksum offload | Segmentation offload | TCP offload engine (TOE) | Receive-side scaling (RSS)
    File server                           | X                | X                    | X                        | X
    Web server                            | X                | X                    | X                        | X
    Mail server (short-lived connections) | X                |                      |                          | X
    Database server                       | X                | X                    | X                        | X
    FTP server                            | X                | X                    | X                        |
    Media server                          | X                |                      | X                        | X


    Disclaimer: The recommendations in Table 6 are intended to serve as guidance only for choosing the most suitable technology for specific server roles under a deterministic traffic pattern. User experience can be different, depending on workload characteristics and the hardware that is used.

    If your hardware supports TOE, then you must enable that option in the operating system to benefit from the hardware’s capability. You can enable TOE by running the following command:



    netsh int tcp set global chimney=enabled

    Tuning the Network Adapter


    You can optimize network throughput and resource usage by tuning the network adapter, if any tuning options are exposed by the adapter. Remember that the correct tuning settings depend on the network adapter, the workload, the host computer resources, and your performance goals.

    Enabling Offload Features


    Turning on network adapter offload features is usually beneficial. Sometimes, however, the network adapter is not powerful enough to handle the offload capabilities at high throughput. For example, enabling segmentation offload can reduce the maximum sustainable throughput on some network adapters because of limited hardware resources. However, if the reduced throughput is not expected to be a limitation, you should enable offload capabilities even for such network adapters. Note that some network adapters require offload features to be independently enabled for send and receive paths.
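
    As a simplified picture of the work that segmentation offload hands to the adapter, the sketch below splits one large send buffer into segments no larger than the maximum segment size (MSS) and tracks the TCP sequence number of each. The function name is hypothetical, header construction is omitted, and the 1460-byte MSS is the value typical of a 1500-byte Ethernet MTU, used here as an assumption.

```python
from typing import List, Tuple


def segment_payload(payload: bytes, mss: int,
                    initial_seq: int = 0) -> List[Tuple[int, bytes]]:
    """Split a large send into (sequence_number, data) pairs of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        seq = (initial_seq + offset) % 2**32  # TCP sequence numbers wrap at 32 bits
        segments.append((seq, chunk))
    return segments


# One 64-KB send handed down by the stack, segmented for a 1460-byte MSS.
large_send = bytes(64 * 1024)
segments = segment_payload(large_send, mss=1460, initial_seq=1000)

print(len(segments))         # 45
print(len(segments[-1][1]))  # 1296
```

    With segmentation offload enabled, the transport passes the single large buffer to the adapter, and the hardware produces the 45 segments. Without it, the host CPU builds each segment and its headers itself.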

    Increasing Network Adapter Resources


    For network adapters that allow for the manual configuration of resources such as receive and send buffers, you should increase the allocated resources. Some network adapters set their receive buffers low to conserve allocated memory from the host. A low value can result in dropped packets and decreased performance. Therefore, for receive-intensive scenarios, we recommend that you increase the receive buffer value to the maximum. If the adapter does not expose manual resource configuration, then either it configures the resources dynamically or the resources are set to a fixed value that cannot be changed.
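
    The effect of a low receive buffer count can be illustrated with a small simulation. The sketch below assumes a simple model in which packets arrive in bursts, the host drains a fixed number of buffers per time step, and any packet that finds no free buffer is dropped. The function name and all numbers are hypothetical and do not describe any specific adapter.

```python
from typing import Iterable


def count_dropped_packets(arrivals_per_tick: Iterable[int],
                          receive_buffers: int,
                          drained_per_tick: int) -> int:
    """Count packets dropped because no receive buffer was free."""
    queued = 0
    dropped = 0
    for arrivals in arrivals_per_tick:
        free = receive_buffers - queued
        accepted = min(arrivals, free)
        dropped += arrivals - accepted            # packets with no free buffer are lost
        queued += accepted
        queued -= min(queued, drained_per_tick)   # host processes some buffers
    return dropped


# Two bursts of 600 packets while the host drains 250 buffers per tick.
bursts = [0, 600, 0, 0, 600, 0, 0]

print(count_dropped_packets(bursts, receive_buffers=256, drained_per_tick=250))   # 688
print(count_dropped_packets(bursts, receive_buffers=1024, drained_per_tick=250))  # 0
```

    In this model the average load is the same in both runs. Only the larger buffer count lets the adapter absorb each burst until the host catches up.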

    Enabling Interrupt Moderation


    To control interrupt moderation, some network adapters expose different interrupt moderation levels, buffer coalescing parameters (sometimes separately for send and receive buffers), or both. You should consider interrupt moderation for CPU-bound workloads, and weigh the host CPU savings and higher latency that come with more moderation against the increased host CPU usage and lower latency that come with more interrupts. If the network adapter does not perform interrupt moderation but does expose buffer coalescing, then increasing the number of coalesced buffers allows for more buffers per send or receive, which improves performance.
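
    The trade-off can be put into rough numbers. The sketch below assumes an idealized adapter that raises one interrupt after every N received packets, with packets arriving at evenly spaced intervals and no moderation timer. Real adapters use vendor-specific schemes, so the function name and the model are assumptions for illustration only.

```python
from typing import Tuple


def moderation_tradeoff(packets_per_second: float,
                        packets_per_interrupt: int) -> Tuple[float, float]:
    """Return (interrupts per second, worst-case added latency in seconds)."""
    if packets_per_second <= 0 or packets_per_interrupt < 1:
        raise ValueError("rates and batch sizes must be positive")
    interrupts_per_second = packets_per_second / packets_per_interrupt
    inter_arrival = 1.0 / packets_per_second
    # The first packet of a batch waits for the remaining N - 1 packets to arrive.
    added_latency = (packets_per_interrupt - 1) * inter_arrival
    return interrupts_per_second, added_latency


for batch in (1, 4, 16):
    rate, latency = moderation_tradeoff(100_000, batch)
    print(f"batch={batch:2d}  interrupts/s={rate:9.0f}  added latency={latency * 1e6:6.1f} us")
```

    At 100,000 packets per second, a batch of 16 cuts the interrupt rate from 100,000 to 6,250 per second in this model, at the cost of up to about 150 microseconds of extra delay for the first packet in each batch.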

    Enabling RSS for Web Scenarios


    RSS can improve Web scalability and performance when there are fewer NICs than processors on the server. When all the Web traffic is going through the RSS-capable NICs, incoming Web requests from different connections can be simultaneously processed across different CPUs. Note that because of the load-distribution logic in RSS and HTTP, performance can be severely degraded if a non-RSS-capable NIC accepts Web traffic on a server that has one or more RSS-capable NICs. We recommend that you either use RSS-capable NICs for all Web traffic or disable RSS from the Advanced Properties tab. To determine whether a NIC is RSS-capable, view the RSS information in the Advanced Properties tab for the device.
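
    The following sketch illustrates the general idea of spreading connections across CPUs while keeping each connection on one CPU. It hashes the connection 4-tuple and uses the result to index an indirection table of CPU numbers. As background not stated in this guide, RSS hardware uses a Toeplitz hash. The sketch substitutes CRC-32 from the Python standard library as a stand-in, and the table size, CPU numbers, and function names are assumptions.

```python
import zlib
from collections import Counter
from typing import List, Tuple

# (source IP, source port, destination IP, destination port)
Connection = Tuple[str, int, str, int]


def build_indirection_table(rss_cpus: List[int], size: int = 128) -> List[int]:
    """Fill a table of the given size by cycling through the RSS CPUs."""
    if not rss_cpus:
        raise ValueError("at least one RSS CPU is required")
    return [rss_cpus[i % len(rss_cpus)] for i in range(size)]


def select_cpu(conn: Connection, table: List[int]) -> int:
    """Map a connection to a CPU; the same 4-tuple always yields the same CPU."""
    key = "|".join(str(field) for field in conn).encode("ascii")
    digest = zlib.crc32(key)  # stand-in for the hash computed by the adapter
    return table[digest % len(table)]


# Hypothetical CPU numbering: one logical processor per physical core.
table = build_indirection_table([0, 2, 4, 6])

# 1,000 client connections to one Web server address.
connections = [("10.0.0.5", 1024 + n, "10.0.0.1", 80) for n in range(1000)]
per_cpu = Counter(select_cpu(conn, table) for conn in connections)
print(sorted(per_cpu.items()))
```

    Because the CPU is a pure function of the 4-tuple, every packet of a given connection lands on the same processor, which is what preserves ordering, while different connections spread across the table.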

    Binding Each Adapter to a CPU


    The method to use for binding network adapters to a CPU depends on the number of network adapters, the number of CPUs, and the number of ports per network adapter. Important factors are the type of workload and the distribution of the interrupts across the CPUs. For a workload such as a Web server that has several network adapters, partition the adapters on a processor basis to isolate the interrupts that the adapters generate.
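
    One way to plan such a partition is sketched below. It divides the CPUs into disjoint, contiguous groups, one group per adapter, and expresses each group as an affinity bitmask in which bit n stands for CPU n. The adapter names, the even split, and both function names are hypothetical. The guide does not prescribe a specific algorithm or tool for applying the binding.

```python
from typing import Dict, List


def partition_cpus(adapters: List[str], cpu_count: int) -> Dict[str, List[int]]:
    """Assign each adapter a disjoint, contiguous group of CPUs."""
    if not adapters or cpu_count < len(adapters):
        raise ValueError("need at least one CPU per adapter")
    per_adapter, extra = divmod(cpu_count, len(adapters))
    assignment: Dict[str, List[int]] = {}
    start = 0
    for index, adapter in enumerate(adapters):
        width = per_adapter + (1 if index < extra else 0)  # spread the remainder
        assignment[adapter] = list(range(start, start + width))
        start += width
    return assignment


def affinity_mask(cpus: List[int]) -> int:
    """Return a bitmask with bit n set for each CPU n in the group."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


plan = partition_cpus(["NIC1", "NIC2", "NIC3"], cpu_count=8)
for adapter, cpus in plan.items():
    print(adapter, cpus, hex(affinity_mask(cpus)))
# NIC1 [0, 1, 2] 0x7
# NIC2 [3, 4, 5] 0x38
# NIC3 [6, 7] 0xc0
```

    Because the groups do not overlap, interrupts from one adapter never land on a CPU that services another adapter, which is the isolation that the guidance above asks for.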

