When you run a server system in your organization, you might have business needs that are not met by using the default settings. For example, you might need the lowest possible energy consumption, or the lowest possible latency, or the maximum possible throughput on your server. This guide describes how you can tune the server settings in Windows Server® 2012 and obtain incremental performance or energy efficiency gains, especially when the nature of the workload varies little over time.
To have the most impact, your tuning changes should take into account the hardware, the workload, the power budgets, and the performance goals of your server. This guide describes the important tuning considerations and settings that can improve performance or energy efficiency, and it explains the potential effect of each setting to help you make an informed decision about its relevance to your system, workload, performance, and energy usage goals.
Since the release of Windows Server 2008, customers have become increasingly concerned about energy efficiency in the datacenter. To address this need, Microsoft® and its partners invested a large amount of engineering resources to develop and optimize the features, algorithms, and settings in Windows Server 2012 and Windows Server 2008 R2 to maximize energy efficiency with minimal effects on performance. This guide describes energy consumption considerations for servers and provides guidelines for meeting your energy usage goals. Although “power consumption” is a more commonly used term, “energy consumption” is more accurate because power is an instantaneous measurement (Energy = Power * Time). Power companies typically charge datacenters for both the energy consumed (megawatt-hours) and the peak power draw required (megawatts).
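The distinction between energy and peak power can be illustrated with a short calculation. The following is a minimal sketch, assuming evenly spaced power readings. The function name and the sample values are hypothetical and are not taken from this guide.

```python
def energy_and_peak(power_samples_watts, interval_seconds):
    """Return (energy in kWh, peak power in kW) for evenly spaced samples.

    Energy = Power * Time, summed over each sampling interval.
    """
    energy_joules = sum(p * interval_seconds for p in power_samples_watts)
    energy_kwh = energy_joules / 3.6e6          # 1 kWh = 3.6 million joules
    peak_kw = max(power_samples_watts) / 1000.0
    return energy_kwh, peak_kw


# Hypothetical readings: one server sampled once per hour for four hours.
samples = [300.0, 450.0, 500.0, 350.0]
energy_kwh, peak_kw = energy_and_peak(samples, interval_seconds=3600)
print(f"energy = {energy_kwh:.2f} kWh, peak = {peak_kw:.2f} kW")
```

Two servers can consume the same energy over a billing period yet differ in peak draw, which is why both quantities matter when you set energy usage goals.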
Note: Registry settings and tuning parameters changed significantly from Windows Server 2003, Windows Server 2008, and Windows Server 2008 R2 to Windows Server 2012. Be sure to use the latest tuning guidelines to avoid unexpected results.
It is important to select the proper hardware to meet your expected performance and power goals. Hardware bottlenecks limit the effectiveness of software tuning. This section provides guidelines for hardware to provide a good foundation for the role that a server will play.
There is a tradeoff between power and performance when you choose hardware. For example, faster processors and more disks yield better performance, but they can also consume more energy.
See Choosing Server Hardware: Power Considerations later in this guide for more details about these tradeoffs. Later sections of this guide provide tuning guidelines that are specific to a server role and include diagnostic techniques for isolating and identifying performance bottlenecks for certain server roles.
Choosing Server Hardware: Performance Considerations
Table 1 lists important items that you should consider when you choose server hardware. Following these guidelines can help remove bottlenecks that might impede the server’s performance.
Processors
Choose 64-bit processors for servers. 64-bit processors have significantly more address space and are required for Windows Server 2012. No 32-bit editions of the operating system are provided, but 32-bit applications run on the 64-bit Windows Server 2012 operating system.
To increase the computing resources in a server, you can use a processor with higher-frequency cores, or you can increase the number of processor cores. If CPU is the limiting resource in the system, a core with 2x frequency typically provides a greater performance improvement than two cores with 1x frequency. Multiple cores are not expected to provide a perfect linear scaling, and the scaling factor can be even less if hyperthreading is enabled because hyperthreading relies on sharing resources of the same physical core.
It is important to match and scale the memory and I/O subsystem with the CPU performance and vice versa.
Do not compare CPU frequencies across manufacturers and generations of processors because the comparison can be a misleading indicator of speed.
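The comparison between one faster core and two slower cores can be sketched with Amdahl's law. This model is introduced here for illustration and is not named in this guide. The sketch assumes a CPU-bound workload whose performance scales linearly with core frequency, and the function names and parallel fractions are hypothetical.

```python
def speedup_faster_core(frequency_factor):
    """Speedup from one core running frequency_factor times faster.

    Assumes a CPU-bound workload that scales linearly with frequency.
    """
    return frequency_factor


def speedup_more_cores(parallel_fraction, cores):
    """Amdahl's law: only the parallel fraction benefits from extra cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Compare one core at 2x frequency with two cores at 1x frequency.
for p in (0.5, 0.9, 1.0):
    print(f"parallel fraction {p:.1f}: "
          f"2x frequency -> {speedup_faster_core(2.0):.2f}x, "
          f"two cores -> {speedup_more_cores(p, 2):.2f}x")
```

Under these assumptions, two cores match the faster core only when the workload is perfectly parallel. Any serial portion, or any resource shared between logical processors, lowers the multicore result further.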
Cache
Choose processors with large L2 or L3 caches. Larger caches generally provide better performance, and they often play a bigger role than raw CPU frequency.
Memory (RAM) and paging storage
When your computer runs low on memory and needs more immediately, the operating system uses hard disk space to supplement system RAM through a procedure called paging. Too much paging degrades overall system performance.
You can optimize paging by using the following guidelines for page file placement:
Isolate the page file on its own storage device(s), or at least make sure it doesn’t share the same storage devices as other frequently accessed files. For example, place the page file and operating system files on separate physical disk drives.
Place the page file on a drive that is not fault-tolerant. Note that if a non-fault-tolerant disk fails, a system crash is likely to occur. If you place the page file on a fault-tolerant drive, remember that fault-tolerant systems are often slower to write data because they write the data to multiple locations.
Use multiple disks or a disk array if you need additional disk bandwidth for paging. Do not place multiple page files on different partitions of the same physical disk drive.
Peripheral bus
In Windows Server 2012, it is highly recommended that the primary storage and network interfaces use PCI Express (PCIe) and that you choose servers with PCIe buses. Also, to avoid bus speed limitations, use PCIe x8 and higher slots for 10 Gigabit Ethernet adapters.
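The slot-width recommendation can be checked with rough arithmetic. The following is a minimal sketch that compares nominal per-direction PCIe bandwidth with the line rate of 10 Gigabit Ethernet ports. The per-lane figures are the commonly quoted values after encoding overhead, the function names are hypothetical, and packet and protocol overhead is ignored, so real headroom is smaller than shown.

```python
# Approximate usable bandwidth per lane, per direction, in MB/s.
LANE_MB_PER_S = {"1.x": 250.0, "2.0": 500.0, "3.0": 984.6}

# 10 gigabits per second expressed in megabytes per second.
TEN_GBE_MB_PER_S = 10_000 / 8


def slot_bandwidth(generation, lanes):
    """Nominal one-direction bandwidth of a PCIe slot in MB/s."""
    return LANE_MB_PER_S[generation] * lanes


def slot_fits(generation, lanes, ports):
    """True if the slot can carry the line rate of all 10 GbE ports."""
    return slot_bandwidth(generation, lanes) >= ports * TEN_GBE_MB_PER_S


for gen, lanes, ports in [("1.x", 4, 1), ("1.x", 8, 1), ("2.0", 4, 2), ("2.0", 8, 2)]:
    verdict = "enough" if slot_fits(gen, lanes, ports) else "not enough"
    print(f"PCIe {gen} x{lanes}, {ports} port(s): "
          f"{slot_bandwidth(gen, lanes):.0f} MB/s is {verdict}")
```

In this simplified model, an x4 slot falls short for a single port on a first-generation bus and for a dual-port adapter on a second-generation bus, while an x8 slot covers both cases.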
Disks
Choose disks with higher rotational speeds to reduce random request service times (by about 2 ms on average when you compare 7,200- and 15,000-RPM drives) and to increase sequential request bandwidth. However, there are cost, power, and other considerations associated with disks that have high rotational speeds.
2.5-inch enterprise-class disks can service a significantly larger number of random requests per second compared to equivalent 3.5-inch drives.
Store frequently accessed data (especially sequentially accessed data) near the “beginning” of a disk because this roughly corresponds to the outermost (fastest) tracks.
Be aware that consolidating small drives into fewer high-capacity drives can reduce overall storage performance. Fewer spindles mean reduced request service concurrency and, therefore, potentially lower throughput and longer response times (depending on the workload intensity).
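The figure of roughly 2 ms quoted above for 7,200- versus 15,000-RPM drives is consistent with the difference in average rotational latency, which is the time for half a revolution. The following is a minimal sketch of that arithmetic. It covers rotational latency only and ignores seek and transfer time, and the function name is hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


slow = avg_rotational_latency_ms(7_200)
fast = avg_rotational_latency_ms(15_000)
difference_ms = slow - fast
print(f"7,200 RPM: {slow:.2f} ms, 15,000 RPM: {fast:.2f} ms, "
      f"difference: {difference_ms:.2f} ms")
```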
Table 2 lists the recommended characteristics for network and storage adapters for high-performance servers. These settings can help prevent your networking or storage hardware from being a bottleneck when they are under heavy load.
Table 2. Networking and Storage Adapter Recommendations
WHQL certified
The adapter has passed the Windows® Hardware Quality Labs (WHQL) certification test suite.
64-bit capability
Adapters that are 64-bit-capable can perform direct memory access (DMA) operations to and from high physical memory locations (greater than 4 GB). If the driver does not support DMA greater than 4 GB, the system double-buffers the I/O to a physical address space of less than 4 GB.
Copper and fiber (glass) adapters
Copper adapters generally have the same performance as their fiber counterparts, and both copper and fiber are available on some Fibre Channel adapters. Certain environments are better suited to copper adapters, whereas other environments are better suited to fiber adapters.
Dual- or quad-port adapters
Multiport adapters are useful for servers that have a limited number of PCI slots.
To address SCSI limitations on the number of disks that can be connected to a SCSI bus, some adapters provide two or four SCSI buses on a single adapter card. Fibre Channel disks generally have no limits to the number of disks that are connected to an adapter unless they are hidden behind a SCSI interface.
Serial Attached SCSI (SAS) and Serial ATA (SATA) adapters also have a limited number of connections because of the serial nature of the protocols, but you can attach more disks by using switches.
Network adapters have this feature for load-balancing or failover scenarios. Using two single-port network adapters usually yields better performance than using a single dual-port network adapter for the same workload.
PCI bus limitations can be a major factor in limiting performance for multiport adapters. Therefore, it is important to place them in a high-performing PCIe slot that provides enough bandwidth.
Interrupt moderation
Some adapters can moderate how frequently they interrupt the host processors to indicate activity or its completion. Moderating interrupts can often reduce CPU load on the host, but unless interrupt moderation is performed intelligently, the CPU savings might come at the cost of increased latency.
Receive Side Scaling (RSS) support
RSS is a technology that enables packet receive-processing to scale with the number of available computer processors. It is particularly important with faster Ethernet (10 Gbps or more).
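The idea behind RSS can be sketched as a hash over each connection's addresses and ports that selects a processor through a lookup table, so that packets of one flow stay on one processor while different flows spread across processors. The following is a simplified illustration, not the adapter's actual algorithm. Real RSS implementations typically use a keyed Toeplitz hash computed in hardware, whereas this sketch substitutes CRC32, and the function names and the table size of 128 are hypothetical.

```python
import zlib


def build_indirection_table(num_cpus, table_size=128):
    """Map each hash bucket to a processor in round-robin order."""
    return [bucket % num_cpus for bucket in range(table_size)]


def cpu_for_flow(src_ip, src_port, dst_ip, dst_port, table):
    """Pick the processor that handles every packet of this flow.

    CRC32 is a stand-in for the hash that a real adapter computes.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    bucket = zlib.crc32(key) % len(table)
    return table[bucket]


table = build_indirection_table(num_cpus=4)

# Count how many of 1,000 hypothetical client flows land on each processor.
counts = [0, 0, 0, 0]
for port in range(50_000, 51_000):
    counts[cpu_for_flow("10.0.0.1", port, "10.0.0.2", 443, table)] += 1
print("flows per processor:", counts)
```

Because the mapping depends only on the flow identifiers, packet ordering within a connection is preserved while receive processing for many connections runs in parallel.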
Offload capability and other advanced features such as message-signaled interrupt (MSI)-X
Offload-capable adapters offer CPU savings that yield improved performance. For more information, see Choosing a Network Adapter later in this guide.
Dynamic interrupt and deferred procedure call (DPC) redirection
Windows Server 2012 has functionality that enables PCIe storage adapters to dynamically redirect interrupts and DPCs. This capability, originally called “NUMA I/O,” can help any multiprocessor system by improving workload partitioning, cache hit rates, and on-board hardware interconnect usage for I/O-intensive workloads.