Interrupt Affinity
Interrupt affinity means binding interrupts from a specific device to one or more specific processors in a multiprocessor server. This forces the ISR and DPC routines for that device to run on those processors. Because network connections and file server sessions all stay on the same network adapter, binding a network adapter's interrupts to a processor allows incoming packets (SMB requests, data) to be processed on a specific set of processors, improving locality and scalability. You cannot configure affinity on single-processor computers.
The Interrupt-Affinity Filter (IntFiltr) tool allows you to change the CPU-affinity of the interrupts in a system.
Using this utility, you can direct any device's interrupts to a specific processor or set of processors (as opposed to always sending interrupts to any of the CPUs in the system). Note that different devices can have different interrupt-affinity settings. This utility will work on any server running Windows Server 2003, regardless of what processor or interrupt controller is used.
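As a rough illustration of what "a specific processor or set of processors" means in practice, processor sets on Windows are commonly expressed as a bitmask in which bit n stands for processor n. The sketch below is only an aid for reasoning about such masks; the function names are hypothetical, and it does not describe how IntFiltr itself stores or applies its settings.

```python
def cpus_to_affinity_mask(cpus):
    """Return a bitmask with bit n set for each processor index n in cpus."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("processor index must be non-negative")
        mask |= 1 << cpu
    return mask


def affinity_mask_to_cpus(mask):
    """Return the sorted list of processor indexes whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Example: bind one adapter to CPU 0 and another to CPUs 2 and 3.
print(hex(cpus_to_affinity_mask([0])))
print(hex(cpus_to_affinity_mask([2, 3])))
print(affinity_mask_to_cpus(0xC))
```

Giving each network adapter a mask that does not overlap with the others keeps its ISR and DPC work on a distinct set of processors, which is the locality effect described above.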
General Tuning Parameters for Client Computers
DormantFileLimit
HKLM\system\CurrentControlSet\Services\lanmanworkstation\parameters\ (REG_DWORD)
By default this registry key is not created. (Windows XP client computers only.)
Specifies the maximum number of files that should be left open on a share after the application has closed the file.
ScavengerTimeLimit
HKLM\system\CurrentControlSet\Services\lanmanworkstation\parameters\ (REG_DWORD)
Windows XP client computers only.
The amount of time in seconds the redirector waits before it starts scavenging dormant file handles (cached file handles that are not currently used by any application).
DisableByteRangeLockingOnReadOnlyFiles
HKLM\System\CurrentControlSet\Services\LanmanWorkStation\Parameters\ (REG_DWORD)
Windows XP client computers only.
Some distributed applications that lock portions of a read-only file as synchronization across clients require that file-handle caching and collapsing behavior be off for all read-only files. This parameter can be set if such applications will not be run on the system and collapsing behavior can be enabled on the client computer.
TcpAckFrequency
Note: TcpAckFrequency applies only to Windows XP clients. The recommended setting for TcpAckFrequency is between one-third and one-half of TcpWindowSize.
For Gigabit cards:
HKLM\system\CurrentControlSet\Services\Tcpip\Parameters\Interfaces
For each Gigabit adapter, add:
TcpAckFrequency (REG_DWORD) = 13 (decimal)
By default this entry is not in the registry.
When acknowledging only data (not control packets), send an ACK once every 13 packets instead of the default of 2. This helps reduce packet processing costs for the network stack during large writes (uploads) from the client to the server.
For FastEthernet cards:
HKLM\system\CurrentControlSet\Services\Tcpip\Parameters\Interfaces
For each FastEthernet adapter, add:
TcpAckFrequency (REG_DWORD) = 5 (decimal)
By default this entry is not in the registry.
When acknowledging only data (not control packets), send an ACK once every 5 packets instead of the default of 2. This helps reduce packet processing costs for the network stack during large writes (uploads) from the client to the server.
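The effect of the TcpAckFrequency values above can be estimated with simple arithmetic. The sketch below is a deliberately simplified model: it assumes one ACK per group of received data packets and ignores the delayed-ACK timer, retransmissions, and control packets. The function name and the 10,000-packet transfer size are illustrative assumptions, not values from this guide.

```python
import math


def acks_for_transfer(data_packets, ack_frequency):
    """Estimate ACKs sent when one ACK covers ack_frequency data packets.

    Simplified model: ignores the delayed-ACK timer and retransmissions.
    """
    if ack_frequency < 1:
        raise ValueError("ack_frequency must be at least 1")
    return math.ceil(data_packets / ack_frequency)


# Hypothetical upload of 10,000 data packets.
packets = 10_000
for label, frequency in (("default", 2), ("FastEthernet", 5), ("Gigabit", 13)):
    print(label, frequency, acks_for_transfer(packets, frequency))
```

Under this model, fewer ACKs means fewer packets for the stack to build and process, which is the saving the settings above aim for.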
Large Active Directory® environments have a few special tuning requirements.
Using the /3GB Switch in the Boot.ini file
On server computers, a large quantity of memory is helpful in reducing disk I/O activity. Use of the /3GB switch gives x86 servers more user mode virtual space and allows Active Directory to improve its caching.
Windows 2000 includes two settings:
-
Using the /3GB switch allows the main Active Directory cache a maximum of 1024 MB.
-
Without the /3GB switch, the main Active Directory cache is limited to 512 MB.
For Windows Server 2003, the Active Directory cache is allowed to grow more freely but remains limited by virtual address space.
Turning Off Signing and Sealing
Client computers running Windows XP with Service Pack 1 (SP1) and higher, and servers running Windows Server 2003 are capable of signing and sealing for improved security, and this is enabled by default. Windows 2000 clients do not enable signing and sealing by default, although Windows 2000 with Service Pack 3 (SP3) has the option to turn it on. Production environments with a secured network do not require this setting to be enabled. The Windows Server 2003 family of operating systems provides an option for disabling signing and sealing. You can find this setting at:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ldap\ldapclientintegrity = REG_DWORD 0x0
Benchmarking Web Workloads (WebBench)
Consider using the following guidelines for benchmarking Web workloads:
-
Isolate the IIS server and other related computers from corporate network traffic.
-
Allow sufficient warm-up time to get to a steady state.
-
Synchronize client clocks with the IIS server clock to ensure proper benchmarking of time-dependent requests.
-
For best performance, turn all recycling, performance, and power options off in the IIS Admin UI, unless you encounter an acute situation where these options may help. For more information, see Performance Tuning for IIS 6.0 earlier in this document.
-
If using SSL, select a reasonable and consistent key size.
WebBench 4.1 provides a way to measure the performance of Web servers. WebBench uses client computers to simulate Web browsers. However, unlike actual browsers, the clients don't display the files that the server sends in response to their requests. Instead, when a client receives a response from the server, it records the information associated with the response and then immediately sends another request to the server.
The following three tables list high-end and low-end server settings and client computer tuning parameters.
Table 12. High-End Server Settings
Type
|
Setting
|
IIS settings
| -
Registry (under HKLM\System\CurrentControlSet\Services\Inetinfo\Parameters\)
-
MaxCachedFileSize (REG_DWORD) 1048576
-
IIS Metabase (under W3SVC/)
-
Use central binary logging by setting CentralBinaryLoggingEnabled = TRUE
-
SSL tuning parameters: Key size 1024 bits. For competitive benchmarking, use the same key size for all servers.
|
Http.sys settings
| -
Registry (under HKLM\System\CurrentControlSet\Services\HTTP\Parameters\)
UriMaxUriBytes (REG_DWORD) 1048576 (largest file in the set).
|
NTFS File System setting
| -
Registry (under HKLM\System\CurrentControlSet\Control\FileSystem\)
NtfsDisableLastAccessUpdate (REG_DWORD) 1
|
TCPIP.SYS performance settings for IIS
| -
Registry (under HKLM\System\CurrentControlSet\Services\tcpip\parameters\)
MaxHashTableSize (REG_DWORD) 0xffff
See also Performance Tuning for Networking earlier in this document.
|
Network adapter tuning and binding for IIS
| -
Each network adapter bound to a CPU.
See also Performance Tuning for Networking earlier in this document.
|
Characteristics of low-end server settings include the following:
-
Single processor, single network adapter.
-
Limited physical memory—at least 256 MB; typically 512 MB RAM.
-
Paging activity expected.
-
Not recommended in the case of large number of ASP files or for memory-heavy dynamic content.
Table 13. Low-End Server Settings
Type
|
Setting
|
IIS settings
| -
Registry (under HKLM\System\CurrentControlSet\Services\Inetinfo\Parameters\)
MaxCachedFileSize (REG_DWORD) 1048576
MemCacheSize (REG_DWORD) 10
-
IIS Metabase (under W3SVC/)
Use central binary logging by setting
CentralBinaryLoggingEnabled = TRUE
|
Http.sys settings
| -
Registry (under HKLM\System\CurrentControlSet\Services\http\parameters\)
UriMaxUriBytes (REG_DWORD) 1048576
RequestBufferLookasideDepth (REG_DWORD) 256
InternalRequestLookasideDepth (REG_DWORD) 256
LargeMemMegabytes (REG_DWORD) 150
|
NTFS file system setting
| -
Registry (under HKLM\System\CurrentControlSet\Control\FileSystem\)
NtfsDisableLastAccessUpdate (REG_DWORD) 1
|
Table 14. Client Computer Tuning Parameters
Type
|
Setting
|
My Computer Performance Settings
| |
TCPIP.SYS performance settings for IIS
| -
Registry (under HKLM\System\CurrentControlSet\Services\tcpip\parameters\)
MaxUserPort (REG_DWORD) 0xfffe
MaxHashTableSize (REG_DWORD) 0xffff
TcpWindowSize (REG_DWORD) 65536 (make the registry change on clients equipped with 100 BaseT Ethernet network adapters)
See also Performance Tuning for Networking in this document.
|
Benchmarking File Server Workload (NetBench)
NetBench 7.02 is an eTesting Labs benchmark program that measures how well file servers handle network file requests from clients. NetBench provides an overall I/O throughput score and average response time for the server, plus individual scores for the client computers. You can use these scores to measure, analyze, and predict how well your server can handle file requests from clients. The data volumes are always formatted between tests to flush and clean up the working set and ensure a fresh start. For improved performance and scalability, partition client data over multiple data volumes.
Registry Tuning Parameters for NetBench on Windows Server 2003
Key
|
Setting
|
HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\
|
PagedPoolSize = 192000000 (decimal) (default=0)
|
HKLM\System\CurrentControlSet\Control\FileSystem\
|
NtfsDisable8dot3NameCreation = 1 (default is 0)
Add NtfsDisableLastAccessUpdate = 1
By default this registry entry is not created.
|
HKLM\System\CurrentControlSet\Services\Tcpip\Parameters\
|
Add NumTcbTablePartitions = 8
By default this registry key is not created.
|
HKLM\System\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\
|
Add TcpAckFrequency (REG_DWORD) = 13 (decimal) for each Gigabit network adapter.
By default this registry key is not created. For FastEthernet adapters set this parameter to 5.
|
Registry Tuning Parameters for NetBench on Client Computers
Key
|
Setting
|
HKLM\System\CurrentControlSet\Services\LanmanWorkStation\Parameters\
|
DisableByteRangeLockingOnReadOnlyFiles = 1 (Windows XP client computers only).
|
HKLM\system\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\
|
Add TcpAckFrequency = 13 (decimal) for each Gigabit network adapter.
By default this registry key is not created. For FastEthernet adapters set this parameter to 5.
|
HKLM\system\CurrentControlSet\Services\lanmanworkstation\parameters\
|
Add DormantFileLimit = 100 (decimal).
By default this registry entry is not created (Windows XP client computers only).
|
HKLM\System\CurrentControlSet\Services\lanmanworkstation\parameters\
|
ScavengerTimeLimit = 100 (decimal) (Windows XP client computers only).
|
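One way to apply the lanmanworkstation client values from the table above consistently across many test clients is to generate a .reg file and import it on each one. The following is a minimal sketch under that assumption; the helper name render_reg is hypothetical, and the per-adapter TcpAckFrequency entry is left out because it must be created under each adapter's own Interfaces subkey.

```python
def render_reg(key_path, dword_values):
    """Render REG_DWORD values for one registry key as .reg file text."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a REG_DWORD")
        # .reg files store DWORDs as eight lowercase hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


CLIENT_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet"
    r"\Services\lanmanworkstation\parameters"
)

# Values taken from the NetBench client table (Windows XP clients).
client_settings = {
    "DisableByteRangeLockingOnReadOnlyFiles": 1,
    "DormantFileLimit": 100,
    "ScavengerTimeLimit": 100,
}

print(render_reg(CLIENT_KEY, client_settings))
```

Review the generated text before importing it, and back up the registry first, because these values change redirector behavior for every application on the client.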
Benchmarking Active Directory Workload (DirectoryMark)
The following tunings are useful for benchmarking the DirectoryMark workload. DirectoryMark tests are best run from a powerful client to a large server. This allows the operator to start large numbers of threads and still receive central data reports. This setup requires a Gigabit network adapter, approximately equal-powered clients and servers, and a server with at least 2 GB of memory.
Add Index for Description Attribute (Server)
Use schema editor to add an index for the description attribute, which is used in the DirectoryMark Addressing and Messaging Search Mixes.
Turn Off Auto Defragmenter
On a server, Auto Defragmenter starts 15 minutes after startup, runs for an hour, and then restarts every 12 hours. Benchmarking environments require reproducible results, so it is recommended that Auto Defragmenter be turned off to avoid any possible interference with a running benchmark. The defragmenter pass can be seen in the event logs if the Auto Defragmenter remains enabled.
The following registry parameter is used for turning off Auto Defragmenter:
HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters\DSA Heuristics = REG_SZ 000001
Increase MaxUserPort and TcpWindowSize in TCP/IP
Heavy use of LDAP binds requires extensive use of dynamic ports. TCP on both server and client computers must hold each port open for a few minutes after its connection closes (the TCP TIME_WAIT state), so more ports must be available through MaxUserPort than are actively in use at any moment.
You can adjust the following registry parameters:
HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\MaxUserPort = REG_DWORD 0xfffe
HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpWindowSize = REG_DWORD 0xffff
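To see why raising MaxUserPort matters for bind-heavy workloads, it can help to estimate how many new connections per second a client can sustain before it runs out of dynamic ports. The sketch below rests on several assumptions that are not stated in this guide: that dynamic ports start at 1024, that the untuned MaxUserPort is 5000, and that each port is held for 240 seconds after use. Treat the results as rough estimates only.

```python
def dynamic_port_count(max_user_port, first_dynamic_port=1024):
    """Count ports in the inclusive range first_dynamic_port..max_user_port."""
    if max_user_port < first_dynamic_port:
        raise ValueError("max_user_port is below the first dynamic port")
    return max_user_port - first_dynamic_port + 1


def sustainable_binds_per_second(max_user_port, hold_seconds,
                                 first_dynamic_port=1024):
    """Estimate the steady-state rate of new connections.

    Assumes every connection uses one dynamic port and that the port stays
    unavailable for hold_seconds after the connection closes.
    """
    ports = dynamic_port_count(max_user_port, first_dynamic_port)
    return ports / hold_seconds


HOLD_SECONDS = 240  # assumed hold time; the guide says only "a few minutes"

for label, max_port in (("assumed default", 5000), ("tuned", 0xFFFE)):
    rate = sustainable_binds_per_second(max_port, HOLD_SECONDS)
    print(f"{label}: MaxUserPort={max_port}, "
          f"{dynamic_port_count(max_port)} ports, about {rate:.1f} binds/s")
```

Under these assumptions the tuned value supports a far higher bind rate, which is why the setting is applied on both the client and the server for DirectoryMark runs.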
Benchmarking Networking Workloads (Ttcp, Chariot)
Tuning for NTttcp
NTttcp is a Winsock–based port of ttcp to Windows. It helps measure network driver performance and throughput on different network topologies and hardware setups. It provides the customer with a multithreaded, asynchronous performance benchmark for measuring achievable data transfer rate on an existing network setup.
Options:
-
A single thread should suffice for optimal throughput.
-
Multiple threads are needed only when a single computer communicates with many clients.
-
Posting enough user receive buffers (using the “-a” option) alleviates TCP copying.
-
Do not post an excessive number of user receive buffers, because the first ones posted complete and return before the additional buffers are needed.
-
It’s best to bind each set of threads to a processor (the second comma-delimited parameter of the “-m” option).
-
Each thread creates a socket that connects (listens) on a different port.
Table 15. Example Syntax for NTttcp Sender and Receiver
Syntax
|
Details
|
Example Syntax for a Sender
NTttcps -m 1,0,10.1.2.3 -a 2
| -
Single thread
-
Bound to CPU 0
-
Connecting to computer with IP 10.1.2.3
-
Posting two send overlapped buffers
-
Default buffer size: 64 KB
-
Default number of buffers: 20K
|
Example Syntax for a Receiver
NTttcpr -m 1,0,10.1.2.3 -a 6 -t 1000
| -
Single thread
-
Bound to CPU 0
-
Listening on IP 10.1.2.3
-
Posting six receive overlapped buffers
-
Default buffer size: 64 KB
-
Default number of buffers: 20K
|
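When many sender and receiver runs have to be scripted, it may be convenient to build the command lines from the same few parameters shown in Table 15. The following is a small sketch that uses only the "-m" and "-a" options described above; the helper name ntttcp_command is hypothetical, and any other options would need to be checked against the NTttcp documentation.

```python
def ntttcp_command(role, threads, cpu, ip, outstanding_buffers):
    """Build an NTttcp command line from the options used in Table 15.

    role is "sender" or "receiver"; the -m mapping is threads,cpu,ip.
    """
    programs = {"sender": "NTttcps", "receiver": "NTttcpr"}
    if role not in programs:
        raise ValueError("role must be 'sender' or 'receiver'")
    if threads < 1 or cpu < 0 or outstanding_buffers < 1:
        raise ValueError("threads, cpu, and buffer counts must be positive")
    return f"{programs[role]} -m {threads},{cpu},{ip} -a {outstanding_buffers}"


# One thread bound to CPU 0, using the example address from the table.
print(ntttcp_command("sender", 1, 0, "10.1.2.3", 2))
print(ntttcp_command("receiver", 1, 0, "10.1.2.3", 6))
```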
Network Adapter
Make sure you enable all offloading features.
TCP
Set TcpWindowSize to something larger than the default value for Gigabit Ethernet (64 KB) only if you have a large bandwidth-delay product.
For example, using the Intel MT Gigabit card on a LAN, you would leave all network adapter and TCP settings at their default values for NTttcp.
-
The Intel MT network adapter offloads LSO and checksum (send as well as receive) by default.
-
The Intel MT network adapter adaptively manages its resources and you will not need to change any network adapter resource values.
-
Coalesce Buffers is not exposed, but the default interrupt moderation scheme works well.
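The TcpWindowSize guidance above depends on the bandwidth-delay product, which is the link speed multiplied by the round-trip time. The sketch below computes it in bytes and compares it with a 65,535-byte window (the 0xffff value used elsewhere in this guide). The round-trip times are illustrative assumptions, not measurements.

```python
def bdp_bytes(link_bits_per_second, rtt_microseconds):
    """Bandwidth-delay product in bytes (integer arithmetic, rounded down)."""
    return link_bits_per_second * rtt_microseconds // 8_000_000


def window_limits_throughput(link_bits_per_second, rtt_microseconds,
                             window_bytes=0xFFFF):
    """Return True if the window is smaller than the bandwidth-delay product."""
    return bdp_bytes(link_bits_per_second, rtt_microseconds) > window_bytes


GIGABIT = 1_000_000_000

# Assumed round-trip times: 200 microseconds (LAN) and 50 milliseconds (WAN).
for label, rtt_us in (("LAN", 200), ("WAN", 50_000)):
    print(label, bdp_bytes(GIGABIT, rtt_us),
          window_limits_throughput(GIGABIT, rtt_us))
```

With a LAN-scale round-trip time the product stays below the default window, which is consistent with leaving TcpWindowSize at its default for NTttcp on a local network.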
Tuning for Chariot
Chariot is a networking workload generator from NetIQ. It stresses the network to help predict networked application performance.
The High_Performance_Throughput script workload of Chariot may be used to simulate the NTttcp workload. The tuning considerations for this workload are the same as those for NTttcp.
Related Links
See the following resources for further information:
-
Transaction Processing Performance Council Web site at www.tpc.org.
-
Lab Report: Windows Server 2003 Outperforms Predecessors at http://www.microsoft.com/windowsserver2003/evaluation/performance/etest.mspx.
-
Performance and Scalability on the Windows Server 2003 Web site at http://www.microsoft.com/windowsserver2003/evaluation/performance/default.mspx.
For the latest information about Windows Server 2003, see the Windows Server 2003 Web site at http://www.microsoft.com/windowsserver2003.