Windows 2000 Datacenter Server is the most powerful server operating system ever offered by Microsoft. It is designed for enterprises that demand the highest levels of availability and scale.
Windows 2000 Datacenter Server expands the SMP and clustering features in Windows 2000 Advanced Server and includes new features to maximize reliability and availability. Datacenter Server is designed to meet the needs of online transaction processing (OLTP), large data warehouses, econometric analysis, and server consolidation.
Maximizing Availability: 32-Way SMP and 4-Node Clustering
Windows 2000 Datacenter Server scales up to 32-way symmetric multiprocessing (SMP) and up to 64 gigabytes (GB) of physical memory, compared with up to 8-way SMP and 8 GB of memory in Windows 2000 Advanced Server. Increasing the amount of work a single server can handle lets network administrators take maximum advantage of Network Load Balancing (NLB). In addition, cluster failover support is expanded in Windows 2000 Datacenter Server to four nodes, compared with two nodes in Windows 2000 Advanced Server.
High Performance with WinSock Direct
In order to exploit the performance benefits of system area networks (SANs), Windows 2000 Datacenter Server includes WinSock Direct, which can be used instead of TCP/IP to streamline communication between hardware and application components distributed within a SAN.
A SAN is a particular class of network architecture that uses high-performance interconnections between secure servers to deliver reliable, high-bandwidth, low-overhead, and low-latency inter-process communications, usually within an IP subnet. SANs use switches to route data, with a typical hub supporting eight or more nodes; larger networks can be built by cascading hubs. Cable length limitations range from a few meters to a few kilometers.
Compared with a standard TCP/IP protocol stack on a local area network (LAN) of comparable line speed, WinSock Direct enables efficient high-bandwidth, low-latency messaging that conserves processor time for application use. High-bandwidth and low-latency inter-process communication (IPC) and network system I/O allow more users on the system and provide faster response times and higher transaction rates.
WinSock Direct makes thousands of existing applications transparently SAN-enabled. As a result, the growth of SAN-based architectures in business-critical environments is expected to accelerate. Now developers of SAN interconnect hardware can develop interconnects that are compatible with WinSock Direct by using the WinSock Direct SAN infrastructure built in to Windows 2000 Datacenter Server.
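Because WinSock Direct plugs in beneath the standard Windows Sockets interface, applications need no SAN-specific code. The following minimal C sketch shows ordinary Winsock 2 client code of the kind that a WinSock Direct service provider can carry over a SAN transparently; without such a provider, the same code simply uses the normal TCP/IP path. The peer address and port are hypothetical values chosen for illustration.

#include <winsock2.h>
#include <string.h>

#pragma comment(lib, "ws2_32.lib")

int main(void)
{
    WSADATA wsa;
    SOCKET s;
    struct sockaddr_in addr;
    const char *msg = "hello from a standard Winsock client";

    /* Ordinary Winsock initialization; nothing here is specific to SANs. */
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;

    s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET) {
        WSACleanup();
        return 1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5001);                  /* hypothetical port     */
    addr.sin_addr.s_addr = inet_addr("10.0.0.2"); /* hypothetical SAN peer */

    /* When both endpoints sit on a SAN with a WinSock Direct provider
       installed, this connection can be mapped onto the SAN interconnect
       transparently. */
    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        send(s, msg, (int)strlen(msg), 0);

    closesocket(s);
    WSACleanup();
    return 0;
}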
Managing Critical Resources: The Process Control Tool
Process Control is a powerful, flexible tool that helps you manage and control the resources that processes use on your system by applying rules that you define. Process Control uses a new kernel object called the Job Object, which can be named and secured. It is used to collect a group of related processes so they can be tracked and managed as a single unit.
Process Control allows administrators to use Job Objects to customize an application's maximum memory use, application priority, processor affinity, and various other limits. When adjusted to fit the design of an application (placing limits only where an application is designed to handle them), Process Control helps ensure predictable and stable operations. For example, you can create rules that prevent processes from consuming excessive memory or CPU time (sometimes called runaway processes).
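Process Control itself is an administrative tool, but the Job Object it relies on is also exposed through the Win32 API. The following C sketch shows how a job object can cap per-process memory, fix a priority class, and restrict processor affinity for every process assigned to it; the job name and limit values are illustrative assumptions, not settings taken from the Process Control tool.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    JOBOBJECT_EXTENDED_LIMIT_INFORMATION limits;
    HANDLE hJob;

    /* Create a named job object; the name is illustrative. */
    hJob = CreateJobObject(NULL, TEXT("ExampleJob"));
    if (hJob == NULL)
        return 1;

    ZeroMemory(&limits, sizeof(limits));
    limits.BasicLimitInformation.LimitFlags =
        JOB_OBJECT_LIMIT_PROCESS_MEMORY |   /* cap memory use per process  */
        JOB_OBJECT_LIMIT_PRIORITY_CLASS |   /* fix the scheduling priority */
        JOB_OBJECT_LIMIT_AFFINITY;          /* restrict processor affinity */
    limits.ProcessMemoryLimit = 64 * 1024 * 1024;       /* 64 MB, illustrative */
    limits.BasicLimitInformation.PriorityClass = NORMAL_PRIORITY_CLASS;
    limits.BasicLimitInformation.Affinity = 0x3;         /* processors 0 and 1 */

    if (!SetInformationJobObject(hJob, JobObjectExtendedLimitInformation,
                                 &limits, sizeof(limits))) {
        CloseHandle(hJob);
        return 1;
    }

    /* Every process assigned to the job is tracked and limited as a unit. */
    if (!AssignProcessToJobObject(hJob, GetCurrentProcess())) {
        CloseHandle(hJob);
        return 1;
    }

    printf("Current process is now governed by the job object's limits.\n");
    CloseHandle(hJob);
    return 0;
}

Process Control applies the same kinds of limits through rules that administrators define, without requiring any application code.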
To learn more about Windows 2000 Datacenter Server, visit www.microsoft.com/windows2000/guide/datacenter/overview/default.asp.
Services and Support Programs
Maintaining optimum reliability and availability requires access to support professionals and programs specifically tailored for demanding business requirements. The major Microsoft support options for businesses include:
Microsoft Alliance Support. This helps very large enterprise customers develop, deploy, and manage enterprise systems built around Microsoft products. Alliance Support is available under two programs:
Microsoft Alliance Support for Enterprise Systems provides the highest level of service available from Microsoft, including personnel dedicated to the organization, the creation and management of exclusive information resources, and executive-level contact between the customer and Microsoft. For more information, see the complete fact sheet at http://support.microsoft.com/directory/factsheets/allenter.doc.
Microsoft Alliance Support for High Availability provides a fully personalized service that focuses on Microsoft products as well as the environment in which they are deployed and the systems and operational processes by which they are managed. Microsoft and industry-leading service providers each deploy their most skilled support professionals for this offering. This provides a single source of support for a complete IT environment built around Microsoft products and technologies. For more information, see the complete fact sheet at http://support.microsoft.com/directory/factsheets/allhigh.doc.
In addition to these support programs, Microsoft offers a range of support offerings suitable for businesses of all sizes. To locate the right support program for your organization, see the Microsoft support options listed at http://support.microsoft.com/directory/overview.asp?sd=gn.
Microsoft Certified Support Centers
Microsoft Certified Support Centers (MCSCs) are industry-leading, multivendor support providers that have a strategic relationship with Microsoft to ensure they deliver high-quality technical support for Microsoft products. All MCSCs offer significant industry expertise in many types of environments and can provide your organization with a broad range of services. For more information on the support options available, see the Microsoft Certified Support Centers home page at http://www.microsoft.com/support/mcsc/.
For a complete summary of support options, see the Support Options Overview page at http://support.microsoft.com/directory/overview.asp.
Windows Datacenter Program
The Windows Datacenter Program provides customers with an integrated hardware, software, and service offering—all delivered by Microsoft and authorized server vendors (OEMs). The program consists of three components:
OEM/Microsoft Jointly Staffed Support Queue
Hardware Compatibility Test and List
Software Maintenance
OEM/Microsoft Jointly Staffed Support Queue
Also known as the Microsoft Certified Support Center (MCSC) for Datacenter, this program tightly links Microsoft and OEM technical and support resources to help customers achieve the highest levels of availability. The jointly staffed support queue helps partners and Microsoft jointly deliver the service required for high-end environments using Windows 2000 Datacenter, including:
Training and information services, such as advanced new product training; access to internships and special partner development programs at Microsoft; a partner-level knowledge base of known issues and resolutions; early notification of critical problems and fixes; and regular technical bulletins of support information.
Software support services, including a joint team of Microsoft and partner support professionals to provide a single point of contact for customers; rapid escalation of critical or complex issues to Microsoft development for fixes; tools for managing hotfixes; and onsite critical problem support for customers.
A source code license to help in isolating and diagnosing system problems.
Business development services, including brand marketing, targeted joint marketing, customer satisfaction measurement, and participation in ongoing service development.
Account management services, including a dedicated account manager, annual business planning assistance, and ongoing advocacy activities within Microsoft.
To be designated as an MCSC Datacenter partner, an organization must meet a series of qualifications as a service provider. Those qualifications include:
Quality: consistent achievement of target customer satisfaction levels for support services provided to end customers and ongoing quality analysis and improvement methodologies.
Staffing and certification: requirements for the number of full-time professionals that support Microsoft products and Microsoft certifications.
Escalation: maximum rates for escalation of non-bug incidents to Microsoft and the ability to share support cases across partner and Microsoft tracking systems.
Problem replication environments: lab and replication environments capable of reproducing all Datacenter HCL systems for troubleshooting customer problems and testing software patches.
IHV/ISV escalation path: 24 x 7 access to an escalation path to independent hardware and software vendors, including debugging resources and symbol files (needed for debugging), for all products certified as part of the Datacenter system.
Service offerings: the capability to offer service components including:
A minimum uptime guarantee of 99.9 percent availability.
Installation and configuration services.
Availability assessments.
24 x 7 hardware and software support.
Response service for onsite hardware and software support.
Change management service.
Hardware Compatibility Test and List
OEM products must pass a special Hardware Compatibility Test conducted by the Windows Hardware Quality Labs (WHQL), verifying that the hardware and software interact efficiently and optimally with Microsoft products.
If successful, these products are placed on the Hardware Compatibility List (HCL), and receive the “Designed for Windows” logo, which lets customers know the products meet Microsoft standards for compatibility with Windows operating systems.
Hardware intended for use with Windows 2000 Datacenter Server must also be designed to the specifications of the “Hardware Design Guide Version 2.0 for Microsoft Windows NT Server” at http://msdn.microsoft.com/library/books/serverdg/hardwaredesignguideversion20formicrosoftwindowsntserver.htm, and the companion “Server Design FAQ” at http://www.microsoft.com/HWDEV/xpapers/SDG2FAQ/FAQ1.htm.
A Windows 2000 Datacenter server must comply with all the required specifications included in the design guide. In addition, all Windows 2000 Datacenter servers must be capable of using eight processors or more, although they can ship with fewer than eight processors.
Windows 2000 Datacenter Server will be provided only by OEMs who are willing to do the extra testing and configuration control, and who can provide comprehensive customer support programs. The testing that OEMs must perform assures customers that the following components will work together smoothly on servers running Windows 2000 Datacenter Server:
All hardware components.
All hardware drivers.
All software that works at the kernel level, including antivirus software, disk and tape management, backup software, and similar types of software.
Requiring a 14-day Test Period
As part of the certification process, Microsoft is requiring a 14-day test period to prove that servers running Windows 2000 Datacenter Server can meet or exceed 99.9 percent availability. Microsoft established the 14-day test based on empirical studies of failures in Windows NT and Windows 2000. To achieve 99.9 percent availability, a Windows 2000 Datacenter Server must have a mean time between failures (MTBF), under normal customer load, of 13.875 days. Microsoft designed the Windows 2000 Datacenter Server test to run at three times normal customer load, which means that the MTBF under test load must meet or exceed 4.625 days. (Extensive reliability research has shown that MTBF is directly related to execution time, not calendar time; therefore, increasing the load can accelerate the test.) In this way, the Datacenter tests were statistically designed to prove that the server can meet or exceed 99.9 percent availability.
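As a rough check of the arithmetic, the test-load requirement follows from dividing the normal-load MTBF by the load factor of three, and the availability figure can be related to the MTBF with the standard steady-state formula. The roughly 20-minute mean time to repair (MTTR) below is an assumption introduced here only to show that the published numbers are consistent; it is not a figure stated by the program.

\mathrm{MTBF}_{\text{test}} = \frac{13.875\ \text{days}}{3} = 4.625\ \text{days}

A = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}
  = \frac{19{,}980\ \text{min}}{19{,}980\ \text{min} + 20\ \text{min}} = 0.999
  \qquad (13.875\ \text{days} = 19{,}980\ \text{min})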
Ongoing Testing Requirements
OEMs are required to resubmit configuration files and test results for Windows 2000 Datacenter–based servers for each Microsoft Windows Service Pack or any driver or service changes provided by the vendor. When a new Windows Datacenter Program configuration becomes available, the previous configuration remains valid. Upgrading to a new configuration and Service Pack should be done only after the customer has reviewed their requirements and system availability needs with their system partners.
Given these stringent testing requirements, customers who receive servers validated by the Windows Datacenter Program know that they are receiving a complete configuration that has been rigorously tested with all hardware components and kernel-level software products.
Datacenter Planning and Operations
The key to installing and maintaining highly reliable Windows 2000 Datacenter Server-based systems is detailed initial planning, followed by sound operating procedures and change control. Before installing Windows 2000 Datacenter Server, you and your vendor should do the following:
Identify workloads and servers you are going to run with Windows 2000 Datacenter Servers.
Determine the specific hardware configuration for these Windows 2000 Datacenter Servers including all required adaptors.
Identify all the installed non-Microsoft kernel drivers required for these systems.
Work with your system supplier to create a Windows Datacenter Program configuration.
Identify your Quick Fix Engineering (QFE) and Service Pack plans and policies.
Ensure that your change control and operation procedures for maintaining Windows Datacenter Program configurations are in place.
After identifying the configuration you require, you can work with your system supplier to receive a Windows Datacenter Program configuration. Windows Datacenter Program configuration files are available from the WHQL site on Microsoft.com at http://www.microsoft.com/hwtest/default.asp or from your system supplier, and can be downloaded to check your systems.
Windows Datacenter Program Servers
At a minimum, servers running Windows 2000 Datacenter Server must contain the following hardware or features:
Pentium III Xeon processors.
Intelligent RAID storage subsystem.
512 KB L2 cache (or equivalent) for single-processor systems; a minimum of 256 KB L2 cache per processor for 2P and larger systems.
CPUs expandable to at least eight processors.
Minimum 2 GB system memory, expandable to 4 GB.
System memory includes ECC memory protection.
Support for 64-bit bus architecture, including a 64-bit physical address space, 64-bit PCI adapters that can address any location in the address space supported by the platform, and 64-bit processors.
SCSI host controller or Fibre Channel adapter.
Power supply protection using N+1 (extra unit).
Support for power supply replacement.
Local hot-swap power supply replacement indicators.
Support for fan replacement.
Support for multiple hard drives.
RAID subsystem supports automatic replacement of failed drive.
RAID subsystem supports manual replacement of failed drive.
Support for at least one of RAID 1, 5, or 10.
Alert indicators for imminent failure.
Alert indicators for occurrence of failure.
For more information about Windows Hardware Quality Labs and the Hardware Compatibility Test, see “The Windows Datacenter Program: Ensuring Hardware Quality” at http://www.microsoft.com/windows2000/guide/datacenter/hcl/dchclprogram.asp.
Software Maintenance
Customers of Windows 2000 Datacenter Server can choose to receive update subscriptions for the operating system from the OEM. The update subscriptions provide access to version releases, supplements, and Service Packs for Datacenter Server. The subscription is available on a monthly or yearly basis, and a customer must continue to renew the subscription with the OEM to obtain the benefits of the subscription.
People and Processes
Microsoft Operations Framework: Roadmap for Reliability
Clearly, a reliable computer operating system is a good start in a company's efforts to provide reliable computer services. But reliability depends a great deal on external factors. If someone forgets to perform an essential process, such as a routine backup, the consequence can be increased downtime. Since everyone makes mistakes, it is not surprising that industry studies show that as much as 80 percent of system failures can be traced to errors caused by people or processes.
To help build operational processes that can reduce the impact of human error and eliminate ineffective processes, Microsoft built the Microsoft Operations Framework (MOF). Based on best practices that have been learned by enterprises over time, MOF provides technical guidance for achieving the highest levels of system reliability, availability, and manageability using Microsoft products and technologies.
Building on Standardized Best Practices
Industry best practices for IT service management are well documented within the Central Computer and Telecommunications Agency’s (CCTA) IT Infrastructure Library (ITIL).
The CCTA is a United Kingdom government executive agency chartered with development of best practice advice and guidance on the use of information technology in service management and operations. To accomplish this, the CCTA charters projects with leading information technology companies from around the world to document and validate best practices in the disciplines of IT service management.
MOF combines these collaborative industry standards with specific guidelines for using Microsoft products and technologies. MOF also extends the ITIL code of practice to support distributed IT environments and current industry trends such as application hosting and Web-based transactional and e-commerce systems. The rest of this section introduces MOF at a high level so you can visualize how you can use these tools to help ensure system reliability.
Enterprise Services Frameworks
MOF is one of the three frameworks that form the Enterprise Services Frameworks (ESF). The other two ESF frameworks are the Microsoft Readiness Framework (MRF) and the Microsoft Solutions Framework (MSF). Figure 5 below shows how each of the frameworks fits into ESF.
Each ESF framework targets a different, but integral, phase in the information technology (IT) life cycle, and provides detailed information about the people, processes, and technologies required to successfully execute that phase of the cycle.
Figure 5. Enterprise Services Frameworks
The Microsoft Operations Framework provides operational guidance in the form of white papers, operations guides, assessment tools, operations kits, best practices, case studies, and support tools. These materials address the people, process, and technologies required for effectively managing production systems within a complex distributed IT environment. For more information on Microsoft's enterprise frameworks and offerings, see:
Microsoft Solutions Framework home page at http://www.microsoft.com/msf.
Microsoft Operations Framework white papers at http://www.microsoft.com/trainingandservices/MOFoverview.
MOF addresses the constant change typically experienced in distributed IT environments and helps guide IT staff through change with the least possible disruption to ongoing service. This framework consists of six fundamental principles. Table 1 below lists these principles and how MOF uses them.
Table 1. Microsoft Operations Framework Principles
IT/business alignment: Design IT services to meet business goals and priorities.
Customer focused: Use service level agreements (SLAs) to manage the quality of customer services.
Spiral life cycle: Continuously assess and adapt operations services.
Team of peers: Organize the communication, skills, roles, and responsibilities of a highly competent and flexible operations staffing model.
Best practices: Leverage industry and Microsoft best practices.
Measurement: Develop and use tools to measure operations activities.
The MOF Process Model
Defining any high-level process model requires a compromise that balances simplicity and understanding with scientific accuracy. IT operations represent a complex set of dynamics. With so many processes, procedures, and communications happening simultaneously across a diverse set of systems, applications, and platforms, it is virtually impossible to model a live system exactly.
As a result, MOF’s approach is to simplify this complex set of dynamics into a framework that is easy to understand and whose principles and practices are easy to incorporate and apply. This simplified approach enables operations staff with varying levels of experience, in an enterprise of any size, to realize tangible benefits in their existing or proposed operations.
The MOF process model has four main concepts that are key to understanding the model:
IT service management, like software development, has a life cycle.
The life cycle is made up of distinct logical phases that run concurrently.
Operations reviews must be both release based and time based.
IT service management touches every aspect of the enterprise.
With this understanding, the MOF process model consists of four integrated phases. They are:
Changing
Operating
Supporting
Optimizing
These phases form a spiral life cycle that can be applied to a specific application, a data center or an entire operations environment with multiple data centers, including outsourced operations and hosted applications.
Each phase culminates with a review milestone specifically tailored to assess the operational effectiveness of the preceding phase. These phases, coupled with their designated review milestones, work together to meet organizational goals and objectives. Figure 6 below illustrates the MOF process model and the relationship of the life cycle phases, the reviews following each phase, and the concept of IT service management at the core of the model. The figure depicts each phase of the IT operation connected in a continuous spiral life cycle.
Figure 6. The MOF Process Model
The process model incorporates two types of review milestones—release based and time based. Two of the four reviews—release readiness and implementation—are release based and occur at the introduction of a release into the target environment. The remaining two reviews—operations and service level agreement—occur at regular intervals to assess the internal operations as well as the customer service levels.
The reason for this mix of review types within the process model is to support two concepts necessary in a successful IT operations environment:
The need to manage the introduction of change through the use of managed releases. Managed releases allow for a clear packaging of change that can then be identified, tracked, tested, implemented, and operated.
The need to continually assess and adapt the operational procedures, processes, tools, and people required to deliver the specific service solutions. The time-based review supports this concept.
The following table summarizes the key activities and subsequent review for each of the four phases:
Changing: Introduce new service solutions, technologies, systems, applications, hardware, and processes. Review: Implementation.
Operating: Execute day-to-day tasks effectively. Review: Operations.
Supporting: Resolve incidents, problems, and inquiries quickly. Review: Service level agreement.
Optimizing: Optimize cost, performance, capacity, and availability. Review: Release readiness.
The MOF process model promotes a high level of availability, reliability, and manageability. For this reason, IT managers will find the MOF process model useful in the following environments:
Production
Production certification
User acceptance
Prerelease or staging
Integration or system test