• Performance Tuning for SAP Sales and Distribution Two-Tier Workload
  • Operating System Tunings on the Server
  • Tunings on the Database Server
  • Tunings on the SAP Application Server
  • Monitoring and Data Collection
  • Performance Tuning for TPC-E Workload
  • Server Under Test (SUT) Tunings
  • SQL Server Tunings
  • Performance Tuning Guidelines for Windows Server 2008 R2 October 15, 2010





    Monitoring and Data Collection


    The following list of performance counters is considered a base set of counters when you monitor the resource usage on the RDS workload. Log the performance counters to a local, raw (blg) performance counter log. It is less expensive to collect all instances (‘*’ wildcard character) and then extract particular instances while post-processing by using Relog.exe:

    \Cache\*
    \IPv4\*
    \LogicalDisk(*)\*
    \Memory\*
    \Network Interface(*)\*
    \Paging File(*)\*
    \PhysicalDisk(*)\*
    \Print Queue(*)\*
    \Process(*)\*
    \Processor Information(*)\*
    \Synchronization(*)\*
    \System\*
    \TCPv4\*

    Note: If applicable, add the \IPv6\* and \TCPv6\* objects.

    Stop unnecessary ETW loggers by running logman.exe stop <provider name> -ets. To view providers on the system, run logman.exe query -ets.

    Use Logman.exe to collect performance counter log data instead of using Perfmon.exe, which enables logging providers and increases CPU usage.
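The collection described above can be scripted. The following is a minimal sketch in Python that only builds the Logman.exe command line for the base counter set; it does not run it. The collector name, output path, and 15-second sample interval are placeholder assumptions, not values from this guide.

```python
import subprocess

# Base counter set from the list above; the IPv6/TCPv6 objects are optional.
BASE_COUNTERS = [
    r"\Cache\*",
    r"\IPv4\*",
    r"\LogicalDisk(*)\*",
    r"\Memory\*",
    r"\Network Interface(*)\*",
    r"\Paging File(*)\*",
    r"\PhysicalDisk(*)\*",
    r"\Print Queue(*)\*",
    r"\Process(*)\*",
    r"\Processor Information(*)\*",
    r"\Synchronization(*)\*",
    r"\System\*",
    r"\TCPv4\*",
]


def with_ipv6(counters):
    """Return the counter list extended with the optional IPv6 objects."""
    return list(counters) + [r"\IPv6\*", r"\TCPv6\*"]


def logman_create_args(name, output_path, counters, interval_seconds=15):
    """Build the argument list for `logman create counter`.

    name, output_path, and interval_seconds are illustrative placeholders.
    """
    return [
        "logman.exe", "create", "counter", name,
        "-f", "bin",                   # raw binary (.blg) log format
        "-si", str(interval_seconds),  # sample interval in seconds
        "-o", output_path,             # local output path
        "-c",
    ] + list(counters)


if __name__ == "__main__":
    args = logman_create_args("base_counters", r"C:\PerfLogs\base_counters",
                              BASE_COUNTERS)
    # Print a quoted command line that can be pasted into an elevated prompt.
    print(subprocess.list2cmdline(args))
```

After creating the collector, it would be started with logman.exe start and stopped with logman.exe stop, using the same collector name.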


    Performance Tuning for SAP Sales and Distribution Two-Tier Workload


    SAP AG has developed several standard application benchmarks. The Sales and Distribution (SD) workload represents one of the important classes of workloads that are used for benchmarking SAP enterprise resource planning (ERP) installations. For more information on obtaining the benchmark kit, see the link to the SAP web page in “Resources” later in this guide.

    SAP updated the SAP SD workload in January, 2009. The updates include added requirements such as subsecond response time and a Unicode codepage. For more information, see the link to the SAP web page in “Resources” later in this guide.

    You can tune the operating system, application server, database server, network, and storage together to achieve optimal throughput and good response times as the number of concurrent SD users increases, up to the point where throughput caps out because of resource limitations.

    The following sections provide guidelines that can benefit the two-tier setup specifically for SAP ERP SD benchmarks on Windows Server 2008 R2. Some of these recommendations might not apply to the same degree for production systems.


    Operating System Tunings on the Server


    • Navigate to Control Panel > System > Advanced System Settings > Advanced tab and configure the following:

    Navigate to Performance Settings > Advanced > Virtual memory and set one or more fixed-size pagefiles (Initial Size equal to Maximum Size). The pagefile size should meet the total virtual memory requirements of the workload. Make sure that no system-managed pagefiles are in the Virtual memory on the Application Server.

    Navigate to Performance Settings > Visual Effects and select the Adjust for best performance check box.



    • To enable SQL Server to use large pages, enable the Lock pages in memory user right assignment for the account that will run the SQL Server and SAP services.

    From the Group Policy MMC snap-in (Gpedit.msc), navigate to Computer Configuration > Windows Settings > Security Settings > Local Policies > User Rights Assignment. Double-click Lock pages in memory and add the accounts that have credentials to run Sqlservr.exe and SAP services.

    • Disable User Account Control.

    Navigate to Start > All Programs > Administrative Tools > System Configuration > Tools tab, select Disable UAC, and then reboot the system. This setting can be used for benchmarking environments, but enabling UAC might be a security compliance requirement in production environments.

    Tunings on the Database Server


    When the database server is SQL Server®, consider setting the following SQL Server configuration options with sp_configure. For detailed information on the sp_configure stored procedure, see the information about setting server configuration options in "Resources" later in this guide.

    • Apply CPU affinity for the SQL Server process: Set an affinity mask to partition the SQL Server process on specific cores. If required, use the affinity64 mask option to set the affinity on more than 32 cores. Starting with SQL Server 2008 R2, you can apply equivalent settings on as many as 256 logical processors by using the ALTER SERVER CONFIGURATION SET PROCESS AFFINITY Data Definition Language (DDL) T-SQL statement, because the sp_configure affinity mask options have been announced for deprecation. For more information on DDL, see “Resources” later in this guide. For the current two-tier SAP SD benchmarks, it is typically sufficient to run SQL Server on one-eighth or fewer of the existing cores.

    • Set a fixed amount of memory that the SQL Server process will use. For example, set the max server memory and min server memory equal and large enough to satisfy the workload (2500 MB is a good starting value).
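As an illustration of the two settings above, the following Python sketch computes an affinity mask for one-eighth of the logical processors and emits the matching sp_configure statements. Choosing the lowest-numbered processors is an assumption made for this example, and the sketch is limited to processors 0 through 30 so that the mask stays a positive 32-bit integer.

```python
def affinity_mask(cpu_ids):
    """Return a bitmask with bit n set for each logical processor n.

    Limited to processors 0-30 in this sketch so the value fits a positive
    signed 32-bit integer; larger systems need affinity64 mask or
    ALTER SERVER CONFIGURATION instead.
    """
    mask = 0
    for cpu in cpu_ids:
        if not 0 <= cpu <= 30:
            raise ValueError("this sketch only handles logical processors 0-30")
        mask |= 1 << cpu
    return mask


def sql_cpu_share(total_logical_processors, divisor=8):
    """Pick one-eighth of the processors (at least one), lowest IDs first.

    Taking the lowest IDs is an assumption for illustration only.
    """
    count = max(1, total_logical_processors // divisor)
    return list(range(count))


def sap_sd_sql_script(total_logical_processors, memory_mb=2500):
    """Build T-SQL that pins SQL Server and fixes its memory size."""
    mask = affinity_mask(sql_cpu_share(total_logical_processors))
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        "EXEC sp_configure 'affinity mask', {};".format(mask),
        # Equal min and max values give SQL Server a fixed amount of memory.
        "EXEC sp_configure 'min server memory (MB)', {};".format(memory_mb),
        "EXEC sp_configure 'max server memory (MB)', {};".format(memory_mb),
        "RECONFIGURE;",
    ])


if __name__ == "__main__":
    print(sap_sd_sql_script(32))
```

For a 32-processor server this selects processors 0 through 3 and uses the 2500 MB starting value from the text; both should be adjusted after measuring the workload.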

    On NUMA-class hardware, you can do the following:



    • To further subdivide the CPUs in a hardware NUMA node to more CPU nodes (known as Soft-NUMA), see the information about configuring SQL Server to use Soft-NUMA in "Resources" later in this guide.

    • To provide NUMA node locality for SQL Server, set preferred NUMA node hints (applies to Windows Server 2008 R2 and later). For the commands below, use the service name. The [server] parameter is optional, the other parameters are required:

    Use the following command to set the preferred NUMA node:

    %windir%\system32\sc.exe [server] preferrednode <SQL Server service name> <NUMA node number>


    You need administrator permissions to set the preferred node. Use %windir%\system32\sc.exe preferrednode to display help text.

    Use the following command to query the setting:

    %windir%\system32\sc.exe [server] qpreferrednode <SQL Server service name>

    This command fails if the service has no preferred node settings. Use %windir%\system32\sc.exe qpreferrednode to display help text.

    Use the following command to remove the setting:

    %windir%\system32\sc.exe [server] preferrednode <SQL Server service name> -1

    On a two-tier SAP ERP setup, consider enabling and using only the Named Pipes protocol for the local SQL connections, and disabling the rest of the available protocols from SQL Server Configuration Manager.

    Tunings on the SAP Application Server


    • The ratio between the number of Dialog (D) processes versus Update (U) processes in the SAP ERP installation might vary, but usually a ratio of 1D:1U or 2D:1U per logical processor is a good start for the SD workload. Ensure that in a SAP dialog instance, the number of worker processes and users does not exceed the capacity of the SAP dispatcher for that dialog instance (the current maximum is approximately 2,000 users per instance). On NUMA-class hardware, consider installing one or more SAP dialog instances per NUMA node (depending on the number of logical processors per NUMA node that you want to use with SAP worker processes). The D:U ratio, and the overall number of SAP dialog instances per NUMA node or system wide, might be improved based on the analysis of previous experiments.

    • To further partition within an SAP instance, use the processor affinity capabilities in the SAP instance profiles to partition each worker process to a subset of the available logical processors and achieve better CPU and memory locality. Affinity setting in the SAP instance profiles is supported for as many as 64 logical processors.

    • Use the FLAT memory model that SAP AG released on November 23, 2006, with the SAP Note No. 1002587 “Flat Memory Model on Windows” for SAP kernel 7.00 Patch Level 87.

    • Windows Server 2008 R2 supports more than 64 logical processors. On such NUMA-class systems, consider setting preferred NUMA nodes in addition to setting hard affinities by using the following steps:

    1. Set the preferred NUMA node for the SAP Win32 service and SAP Dialog Instance services (processes instantiated by Sapstartsrv.exe). When you enter commands on the local system, you can omit the server parameter. For the commands below, use the service short name:

    • Use the following command to set the preferred NUMA node:

    %windir%\system32\sc.exe [server] preferrednode <service name> <NUMA node number>

    You need administrator permissions to set the preferred node. Use %windir%\system32\sc.exe preferrednode to display help text.

    • Use the following command to query the setting:

    %windir%\system32\sc.exe [server] qpreferrednode <service name>

    This command fails if the service has no preferred node settings. Use %windir%\system32\sc.exe qpreferrednode to display help text.

    • Use the following command to remove the setting:

    %windir%\system32\sc.exe [server] preferrednode <service name> -1


    2. To allow each SAP worker process in a dialog instance to inherit the ideal NUMA node from its Win32 service, create registry key entries under the following key for each of the Sapstartsrv.exe, Msg_server.exe, Gwrd.exe, and Disp+work.exe images and set the "NodeOptions"=dword:00000100 value:

    HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\ (IMAGE NAME)\ (REG_DWORD)


    3. If the preferred NUMA node is used without hard affinity settings for SAP worker processes, or if time measurement issues are observed as described by SAP Note No. 532350 released on November 29, 2004, apply the recommendation to let SAP processes use the Query Performance Counter (QPC) timer to stabilize the benchmark environment. Set the following system environment variable:

    %windir%\system32\setx.exe /M SAP_USE_WIN_TIMER YES


    4. If applicable, use the IntPolicy tool as described in the “Interrupt Affinity” section earlier in this guide to set an optimal interrupt affinity for storage or network devices.

    You can use the Coreinfo tool from Windows Sysinternals to provide topology details about logical and physical processors, processor sockets, NUMA nodes, and processor cache. For more information, see “Resources” later in this guide.
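The NodeOptions registry entries described in the steps above can be prepared as a .reg file instead of being typed by hand. The following Python sketch generates that text for the four SAP images named in the text; the function name and the idea of importing a .reg file are assumptions for illustration, and the result should be reviewed before it is imported on a benchmark system.

```python
IFEO_KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
            r"\CurrentVersion\Image File Execution Options")

# Image names listed in the step that sets NodeOptions.
SAP_IMAGES = ["Sapstartsrv.exe", "Msg_server.exe", "Gwrd.exe", "Disp+work.exe"]


def node_options_reg(images=SAP_IMAGES, value=0x100):
    """Return .reg file text that sets NodeOptions for each image."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for image in images:
        # One subkey per executable image under Image File Execution Options.
        lines.append("[{}\\{}]".format(IFEO_KEY, image))
        lines.append('"NodeOptions"=dword:{:08x}'.format(value))
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(node_options_reg())
```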

    Monitoring and Data Collection


    The following list of performance counters is considered a base set of counters when you monitor the resource usage of the Application Server while you are running the two-tier SAP ERP SD workload. Log the performance counters to a local, raw (blg) performance counter log. It is less expensive to collect all instances (‘*’ wildcard character) and then extract particular instances while post-processing by using Relog.exe:

    \Cache\*
    \IPv4\*
    \LogicalDisk(*)\*
    \Memory\*
    \Network Interface(*)\*
    \Paging File(*)\*
    \PhysicalDisk(*)\*
    \Process(*)\*
    \Processor Information(*)\*
    \Synchronization(*)\*
    \System\*
    \TCPv4\*
    \SQLServer:Buffer Manager\Lazy writes/sec

    Note: If applicable, add the \IPv6\* and \TCPv6\* objects.
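The post-processing step mentioned above (extracting particular instances from a log that captured all instances) can be sketched as follows. This Python example only builds a Relog.exe command line; the input and output file names and the two counter paths are illustrative assumptions.

```python
import subprocess

VALID_FORMATS = {"csv", "tsv", "bin", "sql"}


def relog_extract_args(input_blg, output_path, counter_paths, fmt="csv"):
    """Build the argument list that extracts selected counters with Relog.exe."""
    if fmt.lower() not in VALID_FORMATS:
        raise ValueError("unsupported output format: {}".format(fmt))
    return (
        ["relog.exe", input_blg, "-c"]
        + list(counter_paths)
        + ["-f", fmt, "-o", output_path]
    )


if __name__ == "__main__":
    # Hypothetical file names; the counters are a small subset of the base set.
    args = relog_extract_args(
        r"C:\PerfLogs\sap_sd.blg",
        r"C:\PerfLogs\sap_sd_sql.csv",
        [r"\Process(sqlservr)\% Processor Time",
         r"\SQLServer:Buffer Manager\Lazy writes/sec"],
    )
    print(subprocess.list2cmdline(args))
```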

    Performance Tuning for TPC-E Workload


    TPC-E online transaction processing (OLTP) is one of the primary database workloads used to evaluate SQL Server and Windows Server performance. TPC-E uses a central database that executes transactions related to a brokerage firm’s customer accounts. The primary metric for TPC-E is Trade-Result transactions per second (tpsE). Note that Trade-Result transactions account for 10% of the transaction mix. For more information about the TPC-E benchmark, see the TPC-E website listed in “Resources” later in this guide.

    A non-clustered TPC-E benchmark setup consists of two parts: a set of client systems and the server under test (SUT). To achieve maximum system utilization and throughput, you can tune the operating system, SQL Server, storage, memory, processors, and network. This section describes configuration guidelines for achieving optimal TPC-E performance.


    Server Under Test (SUT) Tunings


    Use the following SUT tunings:

    • Set the power scheme to High Performance.

    • Configure pagefiles for best performance:

    Navigate to Performance Settings > Advanced > Virtual memory and configure one or more fixed-size pagefiles with Initial Size equal to Maximum Size. The pagefile size should be equal to the total virtual memory requirement of the workload. Make sure that no system-managed pagefiles are in the virtual memory on the application server.

    Navigate to Performance Settings > Visual Effects and select Adjust for best performance.




    • To enable SQL Server to use large pages, enable the Lock pages in memory user right assignment for the account that will run the SQL Server:

    From the Group Policy MMC snap-in (Gpedit.msc), navigate to Computer Configuration > Windows Settings > Security Settings > Local Policies > User Rights Assignment. Double-click Lock pages in memory and add the accounts that have credentials to run SQL Server.


    • Configure network devices:

    The number of network devices is determined from previous runs. Network device utilization should not be higher than 65%-75% of total NIC bandwidth. Use 1-Gbps NICs at minimum.

    From the Device Manager MMC snap-in (Devmgmt.msc), navigate to Network Adapters and determine the network devices to be used. Disable devices that are not being used.

    If interrupt partitioning is necessary because of high interrupt rates per NIC port, and the device supports interrupt affinity configuration, set the network device interrupt affinity:


        • Using the IntPolicy tool, set interrupt affinity in a round-robin fashion starting from processor 0. If the SUT is a multinode system, determine on which nodes the NICs reside and set the affinity to processors that belong to the node on which each NIC resides. For detailed information on the IntPolicy tool, see "Resources" later in this guide.

    For advanced network tuning information, see “Performance Tuning for the Networking Subsystem” earlier in this guide.




    • Configure storage devices:

    If the operating system is Windows Server 2008 R2, DPC redirection optimization is available on some storage drivers. If the storage device driver supports DPC redirection optimization, there is no need to set interrupt affinity on storage devices. If the storage device driver does not support DPC redirection, or if storage device driver interrupts are not distributed to processors on the same NUMA node where the device resides, set the interrupt affinity for each device by using IntPolicy as advised for networking devices.

    For advanced storage tuning information, see “Performance Tuning for the Storage Subsystem” earlier in this guide.




    • Configure disks for advanced performance:

    From the Disk Management MMC snap-in (Diskmgmt.msc), select each disk in use, right-click to Properties > Policies and select Advanced Performance if it is enabled for the disk.
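The network guideline in the list above (keep utilization at or below roughly 65% to 75% of NIC bandwidth) can be checked against data from previous runs. The following Python sketch assumes the throughput comes from the \Network Interface(*)\Bytes Total/sec counter and that the link speed in bits per second is supplied by the caller; the function names and the sample numbers are illustrative.

```python
import math


def nic_utilization(bytes_total_per_sec, link_bits_per_sec):
    """Fraction of link bandwidth in use, from a Bytes Total/sec sample."""
    return bytes_total_per_sec * 8 / link_bits_per_sec


def classify(utilization, low=0.65, high=0.75):
    """Compare utilization with the 65%-75% guideline from the text."""
    if utilization > high:
        return "over"
    if utilization >= low:
        return "at limit"
    return "ok"


def nics_needed(total_bytes_per_sec, link_bits_per_sec, target=0.65):
    """Number of NICs needed to keep each one at or below the target."""
    usable_bits = link_bits_per_sec * target
    return max(1, math.ceil(total_bytes_per_sec * 8 / usable_bits))


if __name__ == "__main__":
    # Example: 62.5 MB/s on a 1-Gbps link.
    util = nic_utilization(62_500_000, 1_000_000_000)
    print(util, classify(util))
    # Example: 200 MB/s aggregate traffic spread over 1-Gbps NICs.
    print(nics_needed(200_000_000, 1_000_000_000))
```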

    SQL Server Tunings


    Use the following SQL Server tunings:

    • In a benchmark environment, you can use the -T834 start flag to enable SQL Server to use large pages. The use of large pages is not generally recommended outside of benchmarking environments, but overall performance improvements have been observed when applied.

    • If you disable SQL Server performance counters to avoid potential overhead, start SQL Server as a process instead of a service and use the -x flag:

    1. From the Services MMC snap-in (Services.msc), stop and disable SQL Services.

    2. Execute the following command from the SQL Server Binn directory:

    sqlservr.exe -c -x


    • Enable the TCP/IP protocol and consider disabling other protocols:

    • Navigate to Start Menu > Programs > Microsoft SQL Server 2008 R2 > Configuration Tools > SQL Server Configuration Manager. Then navigate to SQL Server Network Configuration > Protocols for MSSQL Server, right-click TCP/IP, and click Enable.




    • Configure SQL Server according to the guidance in the following list. You can configure SQL Server by using the sp_configure stored procedure. Set the show advanced options value to 1 to display more available configuration options. Detailed information about the sp_configure stored procedure is available in “Resources” later in this guide.

    Set CPU affinity for the SQL Server process: Set affinity mask to partition the SQL Server process on specific cores. To set affinity on more than 32 logical processors, use affinity64 mask. Starting with SQL Server 2008 R2, you can apply equivalent settings on as many as 256 logical processors by using the ALTER SERVER CONFIGURATION SET PROCESS AFFINITY Data Definition Language (DDL) T-SQL statement, because the sp_configure affinity mask options have been announced for deprecation. Use the ‘alter server configuration set process affinity cpu =’ command to set affinity to the desired range of processors for each k-group, separated by commas. For more information on DDL, see “Resources” later in this guide.

    If network device interrupt affinity was configured, the LPs to which you partitioned interrupts should not be used to run SQL Server threads.

    You can set a fixed amount of memory for the SQL Server process to use. About 3% of the total available memory is used for the system, and another 1% is used for memory management structures. SQL Server can use the rest of available memory, but not more.

    The following equation is available to calculate total memory to be used by SQL Server:

    SQL Server memory = TotalMemory - (1% of TotalMemory × number of NUMA nodes) - 3% of TotalMemory - 1 GB
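As a worked example of the equation above, the following Python sketch evaluates it in megabytes. It assumes that 1 GB is counted as 1024 MB and that the percentages apply to total physical memory; the 128 GB, four-node server is a made-up example.

```python
def sql_server_memory_mb(total_memory_mb, numa_nodes):
    """Evaluate the memory equation from the text, in MB.

    Assumptions for this sketch: percentages are taken of total physical
    memory, and the fixed 1 GB term is counted as 1024 MB.
    """
    per_node_overhead = total_memory_mb * 0.01 * numa_nodes  # 1% per NUMA node
    system_share = total_memory_mb * 0.03                    # 3% for the system
    fixed_reserve = 1024                                     # 1 GB
    return int(total_memory_mb - per_node_overhead - system_share - fixed_reserve)


if __name__ == "__main__":
    # Hypothetical server: 128 GB of RAM, 4 NUMA nodes.
    print(sql_server_memory_mb(128 * 1024, 4))
```

The result would be used as both the min server memory and max server memory value when a fixed amount of memory is wanted.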

    Leave the lightweight pooling value set to the default of 0. This enables SQL Server to run in threads mode. Threads mode performance is comparable to fibers mode.

    If it appears that the default settings do not allow sufficient concurrent transactions, set the max worker threads value to approximately the number of connected users. Monitor the sys.dm_os_schedulers DMV to determine whether you need to increase the number of worker threads.

    Set the awe enabled value to 1.

    In benchmark environments, set the default trace enabled value to 0. This is not recommended in production environments, because it reduces the ability to diagnose problems.

    Set priority boost value to 1.

    Set allow updates value to 1.
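The settings in the list above can be collected into one T-SQL script. The following Python sketch generates that script, including a process affinity statement that leaves out the logical processors reserved for network interrupts. The processor numbers are hypothetical, and the use of RECONFIGURE WITH OVERRIDE is an assumption made so that all listed values are accepted in a benchmark environment.

```python
# Option values from the list above. They target benchmark runs only.
BENCHMARK_OPTIONS = [
    ("lightweight pooling", 0),
    ("awe enabled", 1),
    ("default trace enabled", 0),
    ("priority boost", 1),
    ("allow updates", 1),
]


def cpu_ranges(cpu_ids):
    """Compress processor IDs into 'a TO b' ranges separated by commas."""
    ranges = []
    for cpu in sorted(set(cpu_ids)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current range
        else:
            ranges.append([cpu, cpu])    # start a new range
    return ", ".join(
        str(a) if a == b else "{} TO {}".format(a, b) for a, b in ranges
    )


def process_affinity_statement(all_cpus, interrupt_cpus):
    """Affinitize SQL Server to every processor not used for interrupts."""
    usable = set(all_cpus) - set(interrupt_cpus)
    if not usable:
        raise ValueError("no logical processors left for SQL Server")
    return "ALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = {};".format(
        cpu_ranges(usable))


def sp_configure_script(options=BENCHMARK_OPTIONS, max_worker_threads=None):
    """Build the sp_configure calls; max worker threads is set only on request."""
    lines = ["EXEC sp_configure 'show advanced options', 1;",
             "RECONFIGURE WITH OVERRIDE;"]
    for name, value in options:
        lines.append("EXEC sp_configure '{}', {};".format(name, value))
    if max_worker_threads is not None:
        lines.append("EXEC sp_configure 'max worker threads', {};".format(
            max_worker_threads))
    lines.append("RECONFIGURE WITH OVERRIDE;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical 16-processor SUT with processors 0 and 1 handling NIC interrupts.
    print(process_affinity_statement(range(16), [0, 1]))
    print(sp_configure_script())
```

Leaving max_worker_threads unset keeps the default, which matches the advice to change it only when the default does not allow enough concurrent transactions.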


