As the business world shifts from host-based computing to distributed computing, a robust operating system on which to host mission-critical line-of-business applications is essential. The core operating system infrastructure must provide high availability and the scalability to take advantage of the latest developments in server hardware. Essential features for an enterprise-level application server platform include:
Memory management including memory protection, virtual memory support, and support for the latest standards to address large amounts of physical memory such as the Intel Page Size Extension 36 or Compaq Alpha Very Large Memory standards.
Process scheduling to allow mission-critical applications to receive CPU priority.
Symmetric multiprocessing (SMP) support for up to 32 CPUs, which is the current industry standard.
Availability features including fail-over clustering and TCP/IP load balancing services.
A high-performance, optimized operating system kernel.
Solaris 7 Implementation Details
The Solaris 7 Operating System was designed to take full advantage of the 64-bit UltraSPARC processor architecture. Solaris for x86, unfortunately, does not take full advantage of all the available features of the Intel architecture, such as PSE36 support. It is a safe assumption that when the 64-bit Intel processors ship, Solaris will be available on that hardware platform.
Virtual Memory support is incorporated into the memory management system in Solaris 7. This ensures the most cost-effective use of physical memory possible. Solaris swaps infrequently accessed code areas to the hard disk, freeing physical memory space for caching and for more active code segments. This allows applications to address a memory space that is larger than the physical RAM in the server. Virtual memory is implemented transparently to the user and developer and grows and shrinks dynamically within the space allocated by the server administrator.
Addressable Memory support in Solaris 7 has been enhanced, allowing it to directly address up to 64GB of physical RAM. However, no support has been added to the x86 version of Solaris 7 to take advantage of the additional memory on Intel Xeon processors through PSE36.
The administrator is allowed to specify the amount of system resources, such as processor time, available to each application, and to prioritize them. This permits tuning for optimum responsiveness for critical applications that are competing for processor resources. Processor affinity is also supported.
SMP support in Solaris on UltraSPARC is capable of handling up to 64 processors. Support for Solaris 7 on x86 is limited to 32-processor hardware.
High-availability features such as clustering and load balancing are supported in the Solaris Enterprise Server extensions. They are not available in the entry-level Solaris 7 product. Other availability features include dynamic reconfiguration, hot-plug swap, and online upgrade capabilities.
Dynamic reconfiguration allows systems to continue running when system boards fail. System boards can then be replaced while the system is running.
Hot-plug swap enables administrators to add or remove subsystems and cards while the system is running. This capability provides strong management and control features for reconfiguring operational systems.
Online upgrade makes it possible to upgrade the operating system while the system is running. During the upgrade, the server remains operational and a reboot is not required.
Solaris Enterprise Server supports 4-node clusters and load balancing. For cluster management, Solaris Enterprise Server uses the Sun Cluster software. Sun Cluster provides an extensive GUI tool for ensuring the reliability, availability and serviceability of servers. For load balancing, Solaris Enterprise Server uses Solaris Resource Manager. Resource Manager can dynamically allocate unused resource capacity to improve application performance. Resource policies configured through the management tool also make it possible to rigidly control resource usage. Sun Cluster and Resource Manager are also available as separate products.
Finally, Java Virtual Machine support is built directly into the Solaris 7 kernel. This provides a native ability to run Java applications on the Solaris 7 operating system platform. The Solaris Java VM provides complete compliance with all of the JDK 1.1 application programming interfaces (APIs) and supports Java Database Connectivity (JDBC), Java Naming and Directory Interface (JNDI), extended RMI functionality, and just-in-time (JIT) compilers. Java applications perform well, and because the Solaris kernel is easy to recompile, most objections to the necessity of making kernel changes when the Java standard is updated are removed.
Windows NT Server 4.0 Implementation Details
Windows NT Server 4.0 was designed from the outset to be a robust, scalable platform for application services. To accomplish this design goal, an extremely sophisticated infrastructure has been developed to provide multiprocessing, advanced memory management, availability, and load balancing services within the Windows NT operating system.
Windows NT Server has provided advanced memory management features since its initial release. These include memory protection and support for virtual memory. With the release of Windows NT Server 4.0 Enterprise Edition, 3GB tuning was added.
3GB Tuning – Windows NT Server 4.0 Enterprise Edition presents an enhanced memory management implementation over that in prior versions of Windows NT. The standard edition of Windows NT Server 4.0 and prior versions provide a virtual 2-GB address space to every application. An additional 2 GB (for a 4-GB total between system and applications) is reserved by the operating system itself. Windows NT Server 4.0 Enterprise Edition extends this capability by allowing large memory-aware applications to use up to 3 GB, reserving only 1 GB for the operating system.
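The arithmetic behind the 2 GB/2 GB and 3 GB/1 GB splits can be checked with a short sketch. It assumes only what the paragraph states: a flat 32-bit virtual address space shared between the application and the operating system. The function name and parameters are illustrative and are not part of any Windows API.

```python
GIB = 2**30  # bytes in one gigabyte (binary)


def address_space_split(address_bits=32, system_reserved_gib=2):
    """Return (application_gib, system_gib) for a flat virtual address space.

    Illustrative helper only: it models the split described in the text,
    not any operating system interface.
    """
    total_gib = 2**address_bits // GIB
    if not 0 < system_reserved_gib < total_gib:
        raise ValueError("system reservation must leave room for the application")
    return total_gib - system_reserved_gib, system_reserved_gib


# Standard edition: 2 GB for the application, 2 GB for the system.
standard = address_space_split(system_reserved_gib=2)

# Enterprise Edition with 3GB tuning: 3 GB for the application, 1 GB for the system.
enterprise = address_space_split(system_reserved_gib=1)

print(standard, enterprise)
```

In both cases the two parts sum to the same 4 GB total; 3GB tuning only moves the boundary between them.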
Addressable Memory – Windows NT Server 4.0 Enterprise systems running on Compaq Alpha processors can also take advantage of the underlying 64-bit architecture of the Alpha CPU. For example, Oracle Very Large Memory (VLM) for Windows NT Server on Alpha allows applications to benefit from the memory addressing capabilities of today’s 64-bit Alpha processors. With VLM, Oracle can use up to 8 GB of physical memory in the Compaq AlphaServer 4100 and up to 28 GB of physical memory in the AlphaServer 8200 and 8400. Additionally, on Intel Pentium II Xeon based server systems with the Intel PSE 36 driver and a supported chipset, Windows NT Server 4.0 Enterprise Edition can access 36 GB of RAM for its applications.
Windows NT Server 4.0 provides minimal process scheduling support by allowing various applications and services to be assigned CPU priorities by the user. Additionally, at the highest level, two performance configuration options can be set to optimize the operating system as a whole for either workstation (foreground application services) or server (background application services) applications.
Windows NT Server supports machines that conform to the symmetric multiprocessing (SMP) standard for multiple CPUs, on both the Intel x86 and Compaq Alpha processor families. In Windows NT Server 4.0 Enterprise Edition, 8 CPUs are supported out of the box. Up to 32 CPUs are supported in custom configurations available from leading hardware OEMs.
Microsoft Clustering services ship as a standard feature with Windows NT Server 4.0 Enterprise Edition to provide clustering system services to guarantee high levels of application and data availability. Microsoft Cluster Server allows two servers to be connected into a cluster for higher availability and easier manageability of server resources. The two servers do not have to be the same size or configuration.
Microsoft Clustering services monitor the health of standard applications and servers, and automatically recover mission-critical data and applications from many common types of failure, usually in under a minute. Alternatively, system administrators can use the cluster service administration console to move workloads around within the cluster to balance processing loads or to unload servers for planned maintenance or testing without taking important data and applications offline for any significant period of time.
Out of the box, Microsoft Clustering server can provide high availability for most popular Windows NT services including file shares, print queues, Internet Information Server, Microsoft Transaction Server, and Microsoft Message Queue Server. Other popular applications supported by Microsoft Clustering server include Microsoft SQL Server 6.5 Enterprise Edition, Microsoft Exchange Server 5.5, and many popular products from Computer Associates, Cheyenne, Baan, Hewlett Packard, IBM, NetIQ, Octopus, SAP, and Vinca.
Network Load Balancing Service (NLBS) was recently introduced as a free upgrade to customers of Windows NT Server 4.0 Enterprise Edition. Microsoft Cluster Server is primarily aimed at providing availability for back-end applications and data services processing for such applications as Microsoft SQL Server. NLBS complements Microsoft Cluster Server by providing availability and load balancing services to the front-end layer.
NLBS installs as a standard Windows NT network driver. Once installed, it operates in a transparent manner both to the TCP/IP server applications, such as Internet Information Server, and to TCP/IP clients on the network. Clients can access an NLBS cluster, which scales up to 32 nodes, through a single IP address. Under normal operation, NLBS automatically balances the networking traffic between the clustered computers, scaling the performance of one server to the level required by the administrator. When a computer fails or goes offline, NLBS automatically reconfigures the cluster to direct the client connections to the remaining computers. The offline system can then transparently rejoin the cluster and regain its share of the workload when it returns to service.
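The behavior described above (one shared address, traffic spread over up to 32 nodes, automatic reconfiguration on failure and rejoin) can be modeled in a few lines. This is a toy sketch and not the NLBS algorithm: the class name, the node names, and the SHA-256 hash of the client address are all assumptions chosen for illustration.

```python
import hashlib


class LoadBalancedCluster:
    """Toy model of a load-balanced cluster behind one shared IP address."""

    MAX_NODES = 32  # cluster size limit quoted in the text

    def __init__(self, nodes):
        if not 1 <= len(nodes) <= self.MAX_NODES:
            raise ValueError("cluster must have between 1 and 32 nodes")
        self.active = sorted(nodes)

    def owner(self, client_ip):
        """Pick the active node that serves this client.

        Every node can run the same computation, so no central
        dispatcher is needed (hypothetical scheme).
        """
        if not self.active:
            raise RuntimeError("no active nodes in the cluster")
        digest = hashlib.sha256(client_ip.encode("ascii")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.active)
        return self.active[index]

    def fail(self, node):
        """Remove a failed node; its clients are remapped to the rest."""
        self.active.remove(node)

    def rejoin(self, node):
        """Return a repaired node to service."""
        if node not in self.active:
            self.active.append(node)
            self.active.sort()


cluster = LoadBalancedCluster(["node-a", "node-b", "node-c", "node-d"])
clients = ["10.0.0.%d" % i for i in range(1, 201)]

cluster.fail("node-b")
served_by = {cluster.owner(ip) for ip in clients}
print("node-b still serving:", "node-b" in served_by)  # prints False

cluster.rejoin("node-b")
```

A simple modulo mapping like this one reshuffles many clients whenever membership changes. A production design would also need to preserve existing connections, which this sketch does not attempt.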
Windows 2000 Server Implementation
Windows 2000 Server uses the core application services infrastructure in Windows NT Server 4.0 as the basis of its core operating system implementation. Numerous improvements have pushed Windows 2000 Server to the leading edge in terms of scalability and availability.
The memory management infrastructure of Windows 2000 Server is greatly improved with the introduction of the Enterprise Memory Architecture (EMA). EMA allows Windows 2000 Advanced Server to take advantage of physical memories larger than 4 GB. Applications that are “large memory aware” can use addresses above 4GB to cache data in memory, resulting in extremely high performance.
Intel Pentium III Xeon microprocessors feature their own standards to take advantage of large physical memory arrays. Windows 2000 Server supports the Compaq Alpha Very Large Memory (VLM) and Intel Page Size Extension (PSE36). Windows 2000 Advanced Server machines can address as much as 64 GB of physical memory.
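The 64 GB figure follows from 36-bit physical addressing, which is what the "36" in PSE36 refers to. A minimal sketch of the arithmetic, using an illustrative helper name:

```python
def max_physical_gib(physical_address_bits):
    """Return the largest physical memory, in GB, reachable with this many address bits."""
    return 2**physical_address_bits // 2**30


# 32 address bits reach 4 GB; 36 address bits reach 64 GB.
limit_32 = max_physical_gib(32)
limit_36 = max_physical_gib(36)
print(limit_32, limit_36)  # prints: 4 64
```

Each additional address bit doubles the reachable memory, so four extra bits raise the limit by a factor of 16.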
Windows 2000 Advanced Server continues to support 8 CPUs in the standard retail kit and 32 CPUs under the OEM terms and conditions of sale. This level of SMP support is unchanged from that of Windows NT Server 4.0 Enterprise Edition, but it features a greatly improved implementation of the SMP code, which allows for more linear scaling on high-performance systems. Performance will be most improved on systems with 8 CPUs or more.
More performance optimizations have been made in Windows 2000 Server to improve CPU, memory, and I/O performance including:
New Winsock driver (AFD) that provides the capability to complete large TransmitFile operations in an APC of the transmitting thread rather than posting it to a delayed system worker thread.
Enhanced memory allocation – the per-processor look-aside lists reduce shared memory access of global look-aside list headers leading to a speedup of five percent for disk I/O.
Additional non-paged and paged pool lists reduce pool fragmentation.
Reduced hold time for the dispatch lock on workloads such as TPC-C (for example, Microsoft SQL Server 7.0 with fibers) reduces contention on key system resources by up to 30 percent.
Fibers in Microsoft SQL Server 7.0 reduce context switches and improve throughput by up to 18 percent compared with threads.
Increased maximum working set for the file system cache from 512 MB in Windows NT Server 4.0 to 960 MB in Windows 2000 Server. This reduces contention on system resources, thereby reducing context switching. On a SpecWeb ’96 workload, this improves throughput by up to 5 percent.
Per-processor completion ports reduce CPU migration of threads, which provides a 5 percent to 7 percent increase in TPC-C throughput with Microsoft SQL Server 6.5.
NTFS improvements reduce the number of operations posted to system worker threads, reducing context switching and thread CPU migrations by 46 percent and improving dual processor throughput on NetBench 5.0 by up to 3 percent.
Increased use of shared locks for NTFS TransactionTable executive resource reduces contention on this resource by up to 14 percent for a multi-processor file-server workload.
Reduced SCSI miniport controller contention (by a factor of seven) on Windows 2000 Server compared to Windows NT Server 4.0 on a TPC-C workload with Microsoft SQL Server 7.0.
Interrupt affinity provides an improvement of up to 7 percent on a SpecWeb ’96 workload on a four-processor system using four NICs.
Reduction of TCP/IP contention is expected to improve four-processor SpecWeb ’96 scaling by up to 20 percent.
Support for I2O hardware has been added. With I2O, several significant performance enhancing benefits can be achieved. The most significant is the offloading of certain I/O operations to intelligent storage adapters, resulting in more available CPU cycles to process complex calculations and lower overall CPU usage.
Other Core Enhancements
Additionally, many enhancements to the applications server infrastructure have been made in Windows 2000 Server including:
Job Objects are new kernel objects that can be named and secured. They are used to collect a group of related processes, enabling management and tracking of the process group. The Job Object enforces job quotas and security context. This enables the monitoring and control of per-process CPU time, per-job CPU time, minimum and maximum working set (memory usage), active process count, CPU affinity (which CPUs in a multi-processor system can run the processes) and priority class.
Scatter/Gather I/O support enables higher I/O throughput when application data is located in non-contiguous memory locations (which is typical) and data needs to be written to a contiguous file location. It is VLM-enabled on the Alpha platform in Windows 2000 Server. The WriteFileGather API takes pointers to one or more pages in memory, gathers them together, and writes them out to the file as one chunk. ReadFileScatter reads in one or more pages from the file system and scatters them to pre-established buffers. The advantage of this technique is that the program need not work with intermediate buffers that contain the data as a single logical chunk.
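The gather and scatter pattern can be illustrated with the POSIX analogues of these calls, which Python exposes as os.writev and os.readv on Unix-like systems. This sketch does not call WriteFileGather or ReadFileScatter and does not reproduce their Win32 requirements; the helper names gather_write and scatter_read are hypothetical.

```python
import os
import tempfile


def gather_write(path, buffers):
    """Write several separate buffers to one contiguous file region in one call."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # os.writev gathers the buffers in order; no intermediate copy
        # into a single large buffer is made by the caller.
        # Note: a robust version would loop on partial writes.
        return os.writev(fd, buffers)
    finally:
        os.close(fd)


def scatter_read(path, sizes):
    """Read one contiguous file region into several pre-allocated buffers."""
    buffers = [bytearray(size) for size in sizes]
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.readv fills each buffer in turn before moving to the next.
        os.readv(fd, buffers)
    finally:
        os.close(fd)
    return buffers


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "pages.bin")
    written = gather_write(demo_path, [b"head", b"-", b"body"])
    parts = scatter_read(demo_path, [4, 1, 4])
    print(written, [bytes(p) for p in parts])
```

As the paragraph notes, the benefit is that the program never has to assemble the data into a single intermediate buffer before writing, or split it apart after reading.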
Spin Count is a new feature introduced to deal with the bottlenecks created when a specific memory block or resource is constantly being acquired and released by a particular process. Usually a critical section guards these resources. However, a thread that blocks on a critical section calls WaitForSingleObject(), which is relatively expensive. If the critical section for a resource is usually acquired and released in a fairly short time period, the critical section can be optimized so that threads spend less time in expensive WaitForSingleObject() calls. The dwReserved field of a critical section is now used for a spin count. If the spin count is set, a thread that would normally block while waiting for a critical section instead enters a loop in which it repeatedly checks whether the critical section can be acquired. If the loop executes “spin count” times without success, the thread gives up and reverts to the old behavior by calling WaitForSingleObject(). The goal is for the waiting thread to acquire the critical section faster by this method than by using WaitForSingleObject(). Spin counts do not produce any performance gain on single-processor systems, but the APIs can be called there with no ill effects.
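The spin-then-block control flow can be sketched as follows. This is a model of the idea only, not the Win32 critical section implementation: the class name and the returned labels are assumptions, and because of Python's global interpreter lock the sketch shows the logic rather than any real performance gain.

```python
import threading


class SpinThenBlockLock:
    """Lock that polls a bounded number of times before blocking.

    Hypothetical illustration of the spin count idea described above.
    """

    def __init__(self, spin_count, lock=None):
        self._spin_count = spin_count
        # An alternative lock object can be supplied, which is useful
        # for exercising the contended paths deterministically.
        self._lock = lock if lock is not None else threading.Lock()

    def acquire(self):
        """Acquire the lock and report which path was taken."""
        # Fast path: the lock is free.
        if self._lock.acquire(blocking=False):
            return "immediate"
        # Spin phase: poll cheaply, hoping the holder releases soon.
        for _ in range(self._spin_count):
            if self._lock.acquire(blocking=False):
                return "spin"
        # Fallback: give up spinning and wait, the expensive path.
        self._lock.acquire()
        return "block"

    def release(self):
        self._lock.release()


guard = SpinThenBlockLock(spin_count=4000)
path_taken = guard.acquire()  # uncontended, so the fast path is used
guard.release()
print(path_taken)  # prints: immediate
```

Choosing the spin count is a trade-off: a value that is too low falls back to blocking almost immediately, while a value that is too high burns CPU time that the lock holder may need in order to finish.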
High Performance Sorting support has been introduced in Windows 2000 Advanced Server. Specifically, commercial sorting performance for large data sets has been optimized, improving the performance of common tasks such as preparing to load data in batch for data warehouse/data mart operations and preparing large sort-sensitive print and batch operations.
The Microsoft Cluster Server implementation in Windows NT Server 4.0 Enterprise Edition and the Windows Load Balancing Service have been enhanced and made standard features as part of the Windows 2000 Advanced Server package. Improvements can be summarized as follows:
Rolling Upgrades allow administrators to easily take a server that is a cluster member offline for maintenance, permitting “rolling upgrades” of system and application software. There are two major advantages to a rolling upgrade. First, service outages are very short during the upgrade process. Second, the cluster configuration does not have to be recreated; it remains intact during the upgrade process.
Active Directory and MMC Integration has been added to the Clustering Service for Windows 2000 Advanced Server. The Active Directory service is automatically used to publish information about clusters. All management is now accomplished with the MMC, making setup easier and allowing administrators to visually monitor the status of all resources in the cluster.
Recovery from Network Failure support has been added to the Clustering Service in Windows 2000 Advanced Server. A sophisticated algorithm has been included to detect and isolate network failures and improve failure recovery actions. It can detect a number of different states for network failures and then use the appropriate failover policy to determine whether or not to fail over the resource group.
Plug and Play Support can now be used by the Clustering Service to automatically detect the addition and removal of network adapters, TCP/IP network stacks, and shared physical disks, expediting configuration.
WINS, DFS, and DHCP Support has been added to the clustering service for automatic failover and recovery. A File Share resource can now serve as a Distributed File System (DFS) root or it can share its folder subdirectories for efficient management of large numbers of related file shares. This provides a highly increased level of availability for mission-critical network services over prior versions of Windows NT.
COM Support for the Cluster API has been added, providing a standard, cross-platform API set for developing and supporting cluster-aware applications. This API can be used to create scalable, cluster-capable applications that can automatically balance loads across multiple servers within the cluster. Additionally, the Windows Script Host (WSH) can control cluster behavior and automate many cluster administration tasks.
Core Application Infrastructure Services Summary
Windows 2000 Server provides a solid foundation architecture for building and running server-based applications. It provides customers with clustering services, resulting in higher application availability. Windows 2000 Server offers support integrated into the operating system for the Compaq Alpha Very Large Memory (VLM) and Intel Page Size Extension (PSE36) standards for addressing extremely large amounts of physical memory. Additionally, Windows 2000 Server provides support for I2O, increasing the performance of disk-intensive applications on I2O-capable systems. Finally, Windows 2000 Server provides mature integrated kernel optimizations such as SMP, memory protection and process management to increase the performance and robustness of enterprise applications.
Windows NT Server 4.0 provides customers with a solid foundation for running server-based applications. It suffers in comparison to Windows 2000 Server by lacking native support for the Intel PSE36 standard. Its clustering services implementation is an entire generation behind that of Windows 2000 Server and does not contain all of the numerous enhancements present in the Windows 2000 Server implementation.
Solaris 7 provides an excellent foundation on which to build server-based applications. On SPARC hardware Solaris 7 provides support for directly addressing 64 GB of physical memory (only 4 GB for Solaris on x86). As a 64-bit operating system (32-bit on x86) Solaris 7 provides a foundation for highly scalable high-performance application servers.