Performance Tuning for File Server Workload (NetBench)
NetBench 7.02 is an eTesting Labs workload that measures the performance of file servers as they handle network file requests from clients. NetBench gives you an overall I/O throughput score and average response time for your server, along with individual scores for the client computers. You can use these scores to measure, analyze, and predict how well your server can handle file requests from clients.
To make sure of a fresh start, the data volumes should always be formatted between tests to flush and clean up the working set. For improved performance and scalability, we recommend that client data be partitioned over multiple data volumes. The networking, storage, and interrupt affinity sections contain additional tuning information that might apply to specific hardware.
Registry Tuning Parameters for Servers
The following registry tuning parameters can affect the performance of file servers:
NtfsDisable8dot3NameCreation
HKLM\System\CurrentControlSet\Control\FileSystem\ (REG_DWORD)
The default is 0. This parameter determines whether NTFS generates a short name in the 8.3 (MS-DOS) naming convention for long file names and for file names that contain characters from the extended character set. If the value of this entry is 0, files can have two names: the name that the user specifies and the short name that NTFS generates. If the name that the user specifies follows the 8.3 naming convention, NTFS does not generate a short name.
Changing this value does not change the contents of a file, but it avoids the short-name attribute creation for the file and also changes how NTFS displays and manages the file. For most file servers, the recommended setting is 1.
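One way to apply the recommended value is the built-in reg.exe tool. The following Python sketch only assembles the command line and does not touch the registry; the helper name build_reg_add is hypothetical, and the resulting command is assumed to be run from an elevated prompt on the server under test.

```python
def build_reg_add(key, value_name, data, value_type="REG_DWORD"):
    """Return a reg.exe argument list that sets one registry value.

    build_reg_add is an illustrative helper, not part of any Microsoft tool.
    """
    return ["reg", "add", key, "/v", value_name,
            "/t", value_type, "/d", str(data), "/f"]


# Key and value name are taken from the text above; 1 is the recommended setting.
cmd = build_reg_add(r"HKLM\System\CurrentControlSet\Control\FileSystem",
                    "NtfsDisable8dot3NameCreation", 1)
print(" ".join(cmd))
```

Printing the command instead of executing it keeps the sketch safe to run on any computer and makes the change easy to review before it is applied.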
HKLM\System\CurrentControlSet\Services\LanmanServer\Parameters\ (REG_DWORD)
The default is 0. This parameter disables the processing of write flush commands from clients. If you set the value of this entry to 1, you can improve the server performance and client latency for power-protected servers.
Registry Tuning Parameters for Client Computers
The following registry tuning parameters can affect the performance of client computers:
HKLM\system\CurrentControlSet\Services\lanmanworkstation\parameters\ (REG_DWORD)
Windows XP client computers only.
This parameter specifies the maximum number of files that should be left open on a share after the application has closed the file.
HKLM\system\CurrentControlSet\Services\lanmanworkstation\parameters\ (REG_DWORD)
Windows XP client computers only.
This parameter is the number of seconds that the redirector waits before it starts scavenging dormant file handles (cached file handles that are currently not used by any application).
DisableByteRangeLockingOnReadOnlyFiles
HKLM\System\CurrentControlSet\Services\LanmanWorkStation\Parameters\ (REG_DWORD)
Windows XP client computers only.
Some distributed applications lock parts of a read-only file as synchronization across clients. Such applications require that file-handle caching and collapsing behavior be off for all read-only files. This parameter can be set if such applications will not be run on the system and collapsing behavior can be enabled on the client computer.
Performance Tuning for File Server Workload (SPECsfs2008)
SPECsfs2008 is a file server benchmark suite from Standard Performance Evaluation Corporation that measures file server throughput and response time, providing a standardized method for comparing performance across different vendor platforms. SPECsfs2008 results summarize the server's capabilities with respect to the number of operations that can be handled per second, and the overall latency of the operations.
To ensure accurate results, you should format the data volumes between tests to flush and clean up the working set. For improved performance and scalability, we recommend that you partition client data over multiple data volumes. The networking, storage, and interrupt affinity sections of this paper contain additional tuning information that might apply to specific hardware.
Registry-Tuning Parameters for NFS Server
You can tune the following registry parameters to enhance the performance of NFS servers:
Parameter | Recommended Value
AdditionalDelayedWorkerThreads | 16
NtfsDisable8dot3NameCreation | 1
NtfsDisableLastAccessUpdate | 1
DefaultNumberOfWorkerThreads | 128
OptimalReads | 1
RdWrHandleLifeTime | 10
RdWrNfsHandleLifeTime | 60
RdWrNfsReadHandlesLifeTime | 10
RdWrThreadSleepTime | 60
FileHandleCacheSizeinMB | 1*1024*1024*1024 (1073741824)
LockFileHandleCacheInMemory | 1
MaxIcbNfsReadHandlesCacheSize | 30000
SecureHandleLevel | 0
RdWrNfsDeferredWritesFlushDelay | 60
CacheAddFromCreateAndMkDir | 1
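The recommended values can be kept in one place and compared against whatever a server currently uses. The sketch below is illustrative only: it assumes you have already collected the current values into a dictionary by some other means (the registry locations are not listed in this section), and the function name find_mismatches is hypothetical.

```python
# Recommended NFS server values, copied from the table above.
RECOMMENDED = {
    "AdditionalDelayedWorkerThreads": 16,
    "NtfsDisable8dot3NameCreation": 1,
    "NtfsDisableLastAccessUpdate": 1,
    "DefaultNumberOfWorkerThreads": 128,
    "OptimalReads": 1,
    "RdWrHandleLifeTime": 10,
    "RdWrNfsHandleLifeTime": 60,
    "RdWrNfsReadHandlesLifeTime": 10,
    "RdWrThreadSleepTime": 60,
    "FileHandleCacheSizeinMB": 1 * 1024 * 1024 * 1024,  # 1073741824
    "LockFileHandleCacheInMemory": 1,
    "MaxIcbNfsReadHandlesCacheSize": 30000,
    "SecureHandleLevel": 0,
    "RdWrNfsDeferredWritesFlushDelay": 60,
    "CacheAddFromCreateAndMkDir": 1,
}


def find_mismatches(current, recommended=RECOMMENDED):
    """Return {name: (current_value, recommended_value)} for every
    parameter that is missing from `current` or differs from the table."""
    return {name: (current.get(name), want)
            for name, want in recommended.items()
            if current.get(name) != want}


# Hypothetical snapshot: one value differs and one is not set at all.
snapshot = dict(RECOMMENDED, OptimalReads=0)
del snapshot["SecureHandleLevel"]
print(find_mismatches(snapshot))
```

A missing parameter is reported with a current value of None, which distinguishes "not set" from "set to a different value".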
Performance Tuning for Network Workload (NTttcp)
Tuning for NTttcp
NTttcp is a Winsock-based port of ttcp to Windows. It helps measure network driver performance and throughput on different network topologies and hardware setups. It provides the customer a multithreaded, asynchronous performance workload for measuring achievable data transfer rate on an existing network setup. For more information, see “Resources” later in this guide.
Options include the following:
A single thread should be sufficient for optimal throughput.
Multiple threads are required only for single-to-many client scenarios.
Posting enough user receive buffers (by increasing the value passed to the -a option) reduces TCP copying.
You should not excessively post user receive buffers because the first ones that are posted would return before you need to use other buffers.
It is best to bind each set of threads to a processor (the second delimited parameter in the -m option).
Each thread creates a socket that connects (listens) on a different port.
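As a rough illustration of how these options combine, the following Python sketch assembles sender and receiver command lines. It uses only the program names and options that appear in this section (NTttcps, NTttcpr, -m, -a, -fr); the helper names and default values are assumptions chosen to match Table 11, and the script does not run NTttcp itself.

```python
def thread_mapping(threads, cpu, ip):
    """Build the -m value: thread count, processor to bind to, IP address."""
    return f"{threads},{cpu},{ip}"


def sender_args(ip, cpu=0, threads=1, buffers=2):
    """Argument list for the sender; `buffers` is the value passed to -a."""
    return ["NTttcps", "-m", thread_mapping(threads, cpu, ip),
            "-a", str(buffers)]


def receiver_args(ip, cpu=0, threads=1, buffers=6, full_buffers=True):
    """Argument list for the receiver; -fr posts full-length receive buffers."""
    args = ["NTttcpr", "-m", thread_mapping(threads, cpu, ip),
            "-a", str(buffers)]
    if full_buffers:
        args.append("-fr")
    return args


print(" ".join(sender_args("10.1.2.3")))
print(" ".join(receiver_args("10.1.2.3")))
```

Keeping the processor number as an explicit argument makes it easy to generate one command per processor when several thread sets are bound to different CPUs.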
Table 11. Example Syntax for NTttcp Sender and Receiver
Syntax
|
Details
|
Example Syntax for a Sender
NTttcps -m 1,0,10.1.2.3 -a 2
|
Single thread.
Bound to CPU 0.
Connecting to a computer that uses IP 10.1.2.3.
Posting two send-overlapped buffers.
Default buffer size: 64 KB.
Default number of buffers to send: 20 K.
|
Example Syntax for a Receiver
NTttcpr -m 1,0,10.1.2.3 -a 6 -fr
|
Single thread.
Bound to CPU 0.
Binding on local computer to IP 10.1.2.3.
Posting six receive-overlapped buffers.
Default buffer size: 64 KB.
Default number of buffers to receive: 20 K.
Posting full-length (64-KB) receive buffers.
|
Network Adapter
Make sure that you enable all offloading features.
TCP/IP Window Size
For 1-Gbps adapters, the settings shown in Table 11 should provide good throughput because NTttcp sets the default TCP window size to 64 KB through a specific socket option (SO_RCVBUF) for the connection. This provides good performance on a low-latency network. In contrast, for high-latency networks or for 10-Gbps adapters, NTttcp's default TCP window size yields less than optimal performance. In both cases, you must adjust the TCP window size to allow for the larger bandwidth-delay product. You can statically set the TCP window size to a large value by using the -rb option. This option disables TCP Window Auto-Tuning, and we recommend its use only if the user fully understands the resultant change in TCP/IP behavior. By default, the TCP window size is set at a sufficient value and adjusts only under heavy load or over high-latency links.
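The bandwidth-delay product is the link bandwidth multiplied by the round-trip time, and it gives the amount of data that must be in flight to keep the link full. The following Python sketch compares it with the 64-KB default window; the round-trip times are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
DEFAULT_WINDOW_BYTES = 64 * 1024  # NTttcp default window described above


def bdp_bytes(bandwidth_bps, rtt_us):
    """Bandwidth-delay product in bytes.

    bandwidth_bps: link speed in bits per second
    rtt_us: round-trip time in microseconds (integer math avoids rounding)
    """
    return bandwidth_bps * rtt_us // (8 * 1_000_000)


# Assumed example links: (label, bits per second, RTT in microseconds).
links = [
    ("1 Gbps, 0.5 ms RTT", 10**9, 500),
    ("10 Gbps, 0.5 ms RTT", 10**10, 500),
    ("1 Gbps, 20 ms RTT", 10**9, 20_000),
]

for label, bps, rtt in links:
    need = bdp_bytes(bps, rtt)
    verdict = "default window is enough" if need <= DEFAULT_WINDOW_BYTES \
        else "larger window needed"
    print(f"{label}: {need} bytes in flight, {verdict}")
```

Under these assumptions, only the low-latency 1-Gbps link fits within 64 KB; the faster link and the high-latency link both need a window several times larger.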
Windows Server 2008 R2 supports Receive-Side Scaling (RSS) out of the box. RSS enables multiple deferred procedure calls (DPCs) to be scheduled and executed on concurrent processors, which improves scalability and performance for receive-intensive scenarios that have fewer networking adapters than available processors. Note that, because of hardware limitations on some adapters and other functionality constraints, not all adapters can support concurrently processing DPCs on all processors on the server. DPCs are also not scheduled on hyperthreading processors because of an adverse effect on performance. Therefore, DPCs in RSS are scheduled only on logical and physical processors regardless of how many cores or sockets are on the server.
Tuning for IxChariot
IxChariot is a networking workload generator from Ixia. It stresses the network to help predict networked application performance.
You can use the High_Performance_Throughput script workload of IxChariot to simulate the NTttcp workload. The tuning considerations for this workload are the same as those for NTttcp.
For more information on IxChariot, see "Resources" later in this guide.
Performance Tuning for Remote Desktop Services Knowledge Worker Workload
Windows Server 2008 R2 Remote Desktop Services (RDS) capacity planning tools include automation framework and application scripting support that enable the simulation of user interaction with RDS. The RDS knowledge worker workload is built with these tools to emulate common usage patterns for knowledge workers. Be aware that the following tunings apply only to a synthetic RDS knowledge worker workload and are not intended as tunings for a server that is not running this workload.
The RDS knowledge worker workload uses Microsoft Office applications and Microsoft Internet Explorer. It operates in an isolated local network that has the following infrastructure:
Domain controller (Active Directory, Domain Name System (DNS), and Dynamic Host Configuration Protocol (DHCP)).
Microsoft Exchange Server for e-mail hosting.
IIS for Web hosting.
Load Generator (a test controller) for creating a distributed workload.
A pool of Windows XP–based test systems to execute the distributed workload, with no more than 60 simulated users for each physical test system.
RDS (Application Server) with Microsoft Office installed.
Note: The domain controller and the load generator could be combined on one physical system without degrading performance. Similarly, IIS and Exchange Server could be combined on another computer system.
Table 12 provides guidelines for achieving the best performance on the RDS workload and suggestions as to where bottlenecks might exist and how to avoid them.
Table 12. Hardware Recommendations for RDS Workload
Hardware limiting factor
|
Recommendation
|
Processor usage
|
Use 64-bit processors to expand the available virtual address space.
Use multicore systems (at least two or four sockets and dual-core or quad-core 64-bit CPUs).
|
Physical disks
|
Separate the operating system files, pagefile, and user profiles (user data) to individual physical partitions.
Choose the appropriate RAID configuration. (Refer to “Choosing the RAID Level” earlier in this guide.)
If applicable, set the write-through cache policy to 50% reads and 50% writes.
If applicable, select Enable write caching on the disk through the Microsoft Management Console (MMC) disk management snap-in (Diskmgmt.msc).
If applicable, select Enable Advanced Performance through the MMC disk management snap-in (Diskmgmt.msc).
|
Memory (RAM)
|
The amount of RAM and physical memory access times affect the response times for the user interactions. On NUMA-type computer systems, make sure that the hardware configuration uses NUMA; this is changed through the system BIOS or hardware partitioning settings.
|
Network bandwidth
|
Allow enough bandwidth by using network adapters that have high bandwidths, such as 1-Gbps Ethernet.
|
After you have installed the operating system and added the RDS role, apply the following changes:
Navigate to Control Panel > System > Advanced System Settings > Advanced tab and set the following:
Navigate to Performance Settings > Advanced > Virtual memory and set one or more fixed-size pagefiles (Initial Size equal to Maximum Size) with a total pagefile size at least two to three times the physical RAM size to minimize paging. For servers that have hundreds of gigabytes of memory, the complete elimination of the paging file is possible. Otherwise, the paging file might be limited because of constraints in available disk space. There are no clear benefits of a paging file larger than 100 GB. Make sure that no system-managed pagefiles are configured under Virtual memory on the Application Server.
Navigate to Performance Settings > Visual Effects and select the Adjust for best performance check box.
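The pagefile sizing guidance above can be expressed as a small calculation. The Python sketch below treats the 100-GB figure as an upper bound, which is one interpretation of the guidance rather than a stated rule, and it does not model the option of removing the pagefile on very large servers; the function name is hypothetical.

```python
def fixed_pagefile_gb(ram_gb, multiplier=2, cap_gb=100):
    """Total fixed pagefile size in GB (Initial Size equal to Maximum Size).

    multiplier: two to three times physical RAM, per the guidance above
    cap_gb: assumed ceiling, because larger pagefiles show no clear benefit
    """
    if not 2 <= multiplier <= 3:
        raise ValueError("multiplier should be between 2 and 3")
    return min(ram_gb * multiplier, cap_gb)


for ram in (8, 32, 64):
    print(f"{ram} GB RAM: {fixed_pagefile_gb(ram)} GB at 2x, "
          f"{fixed_pagefile_gb(ram, multiplier=3)} GB at 3x")
```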
Allow for the workload automation to run by opening the MMC snap-in for Group Policy (Gpedit.msc) and making the following changes to Local Computer Policy > User Configuration > Administrative Templates:
Navigate to Control Panel > Display, and disable Screen Saver and Password protected screen saver.
Under Start Menu and Taskbar, enable Force Windows Classic Start Menu.
Navigate to Windows Components > Internet Explorer, and enable Prevent Performance of First Run Customize settings and select Go directly to home page.
Navigate to Start > All Programs > Administrative Tools > System Configuration > Tools tab, disable User Account Control (UAC) by selecting Disable UAC, and then reboot the system.
HKLM\SOFTWARE\Microsoft\Internet Explorer\Low Rights\ (REG_DWORD)
Minimize the effect on CPU usage when you are running many RDS sessions by opening the MMC snap-in for Group Policy (Gpedit.msc) and making the following changes under Local Computer Policy > User Configuration > Administrative Templates:
Under Start Menu and Taskbar, enable Do not keep history of recently opened documents.
Under Start Menu and Taskbar, enable Remove Balloon Tips on Start Menu items.
Under Start Menu and Taskbar, enable Remove frequent program list from Start Menu.
Minimize the effect on the memory footprint and reduce background activity by disabling certain Microsoft Win32® services. The following are examples from command-line scripts to do this:
Service name
|
Syntax to stop and disable service
|
Desktop Window Manager Session Manager
|
sc config UxSms start= disabled
sc stop UxSms
|
Windows Error Reporting service
|
sc config WerSvc start= disabled
sc stop WerSvc
|
Windows Update
|
sc config wuauserv start= disabled
sc stop wuauserv
|
Minimize background traffic by opting out of diagnostics feedback programs. Under Start > All Programs > Administrative Tools > Server Manager, go to Resources and Support:
Opt out of participating in the Customer Experience Improvement Program (CEIP).
Opt out of participating in Windows Error Reporting (WER).
Apply the following changes from the Remote Desktop Session Host Configuration MMC snap-in (Tsconfig.msc):
Set the maximum color depth to 24 bits per pixel (bpp).
Disable all device redirections.
Navigate to Start > All Programs > Administrative Tools > Remote Desktop Services > Remote Desktop Session Host Configuration and change the Client Settings from the RDP-Tcp properties as follows:
Limit the Maximum Color Depth to 24 bpp.
Disable redirection for all available devices such as Drive, Windows Printer, LPT Port, COM Port, Clipboard, Audio, Supported Plug and Play Devices, and Default to main client printer.
|