Recommendations
This section provides general performance and capacity recommendations. Use these recommendations to determine the capacity and performance characteristics of the starting topology that you created and to decide whether you have to scale out or scale up the starting topology.
Hardware recommendations
For specific information about minimum and recommended system requirements for both front-end Web servers and application servers, see Hardware and Software Requirements (SharePoint Server 2010).
The optimal deployment depends heavily on the expected number of users, how heavily they use the system, and the type of usage. As a starting point for your own deployments, consider the following guidelines:
Application servers and front-end Web servers
Processor(s) | 2 quad core @2.33 GHz
RAM | 16 GB
Operating system | Windows Server 2008, 64-bit
Size of the SharePoint drive | 3x146GB 15K SAS (3 RAID 1 disks). Disk 1: operating system; Disk 2: swap and BLOB cache; Disk 3: logs and temp directory
Number of NICs | 2
NIC speed | 1 Gbps
Authentication | NTLM
Load balancer type | Hardware load balancing
Average number of daily unique visitors | Average concurrent users | Recommended topology
100 | 10 | 1 front-end Web server, 1 application server
1000 | 30 | 2 front-end Web servers, 2 application servers
10000 | 300 | 4 front-end Web servers, 3 application servers
Note that for heavy usage of the PowerPoint Broadcast feature, a separate server farm is recommended (as detailed in Configure Broadcast Slide Show performance).
To see a specific example of a departmental deployment and its results, see the SharePoint 2010 Capacity Planning Case Study: Departmental Collaboration.
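The starting-point topology table above can be read as a simple lookup. The following Python sketch is illustrative only: the names `Topology` and `recommend_topology` are hypothetical, and treating each row's visitor count as the upper bound of its tier is an assumption, because the table does not say how to interpolate between rows.

```python
from typing import NamedTuple, Optional


class Topology(NamedTuple):
    front_end_web_servers: int
    application_servers: int


# Rows from the starting-point topology table:
# (average daily unique visitors, average concurrent users, topology).
STARTING_TOPOLOGIES = [
    (100, 10, Topology(1, 1)),
    (1000, 30, Topology(2, 2)),
    (10000, 300, Topology(4, 3)),
]


def recommend_topology(daily_unique_visitors: int) -> Optional[Topology]:
    """Return the smallest tabulated topology that covers the visitor count.

    Assumption: each row is treated as an upper bound for its tier.
    Returns None above the largest row, where the table gives no guidance.
    """
    for max_visitors, _concurrent_users, topology in STARTING_TOPOLOGIES:
        if daily_unique_visitors <= max_visitors:
            return topology
    return None


if __name__ == "__main__":
    print(recommend_topology(750))
```

A result of `None` signals that the expected load is beyond the tabulated starting points, so the topology would have to be sized through testing instead.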
Scaled-up and scaled-out topologies
To increase the capacity and performance of one of the starting-point topologies, you can do one of two things. You can either scale up by increasing the capacity of your existing server computers or scale out by adding servers to the topology. This section describes what to consider when choosing between scaling up and scaling out to add capacity for the Office Web Apps.
The efficiency of the application server drops significantly once more than 8 CPU cores are made available. Global locks prevent the extra cores from being used effectively, so scaling out is most effective once this limit has been reached.
For heavy Word Web App viewing use cases, the application server is CPU bound. Adding more cores (subject to the limit mentioned above) or more machines is the best way to add capacity.
For heavy Word Web App editing use cases and heavy OneNote Web App editing or viewing use cases, the front-end Web servers are CPU bound. To scale up, add CPU capacity to front-end Web servers. Note that with enough CPU capacity, eventually heavy editing usage will result in the front-end Web servers becoming memory bound. In this case, scaling out will result in the best performance gains.
For heavy PowerPoint Web App viewing use cases, the application server will be CPU bound, while the front-end Web servers will be memory bound.
For heavy PowerPoint Web App editing use cases, the application server will be memory bound.
For heavy PowerPoint Broadcast use cases, the front-end Web servers will be CPU bound.
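The use cases listed above can be kept as a small lookup table when planning which server role to watch. This is a minimal sketch: the dictionary layout and the function name `expected_bottlenecks` are hypothetical, and only the combinations stated in this section are included.

```python
# (Web App, activity) -> {server role: resource that becomes the bottleneck}.
# Only the combinations described in this section are listed.
BOTTLENECKS = {
    ("Word", "viewing"): {"application server": "CPU"},
    ("Word", "editing"): {"front-end Web server": "CPU"},
    ("OneNote", "viewing"): {"front-end Web server": "CPU"},
    ("OneNote", "editing"): {"front-end Web server": "CPU"},
    ("PowerPoint", "viewing"): {
        "application server": "CPU",
        "front-end Web server": "memory",
    },
    ("PowerPoint", "editing"): {"application server": "memory"},
    ("PowerPoint", "broadcast"): {"front-end Web server": "CPU"},
}


def expected_bottlenecks(web_app: str, activity: str) -> dict:
    """Return the bound resource per server role, or {} if not covered here."""
    return BOTTLENECKS.get((web_app, activity), {})


if __name__ == "__main__":
    for role, resource in expected_bottlenecks("PowerPoint", "viewing").items():
        print(f"{role}: {resource} bound")
```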
In general, the Web Apps are designed such that scaling out results in a more robust, fail-safe environment than scaling up does. While deployments on a small number of large machines will work well, a scaled-out topology with multiple application servers and front-end Web servers, sized to handle the expected load, is recommended.
When scaling out, the optimum ratio of application servers to front-end Web servers varies with the number of requests that can be served from pre-rendered cached documents. As guidance, the ratio should never exceed 4 application servers to 1 front-end Web server, which was the optimum determined in our lab when every document request resulted in a new render. When scaling out, it is also recommended that affinity be enabled between clients and front-end Web servers, because this ensures optimum performance, particularly with the PowerPoint Web App.
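The two numeric limits in this section (8 effective cores per application server, and at most 4 application servers per front-end Web server) can be expressed as simple checks. This is a sketch under stated assumptions: the function names are hypothetical, and the ratio is read as an upper bound on application servers per front-end Web server.

```python
MAX_EFFECTIVE_APP_SERVER_CORES = 8  # efficiency drops beyond this many cores
MAX_APP_SERVERS_PER_FRONT_END = 4   # lab optimum when every request is a new render


def next_scaling_step(app_server_cores: int) -> str:
    """Suggest how to add application server capacity for a given core count."""
    if app_server_cores < MAX_EFFECTIVE_APP_SERVER_CORES:
        return "scale up or scale out"
    # At or beyond 8 cores, extra cores are not used effectively.
    return "scale out"


def ratio_within_guidance(app_servers: int, front_end_servers: int) -> bool:
    """Check the application-server to front-end ratio against the 4:1 limit.

    Assumption: the guidance is read as "no more than 4 application
    servers per front-end Web server".
    """
    if front_end_servers < 1:
        raise ValueError("at least one front-end Web server is required")
    return app_servers <= MAX_APP_SERVERS_PER_FRONT_END * front_end_servers


if __name__ == "__main__":
    print(next_scaling_step(8))
    print(ratio_within_guidance(app_servers=5, front_end_servers=1))
```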
Common bottlenecks and their causes
Performance testing revealed several common bottlenecks. A bottleneck is a condition in which the capacity of a particular constituent of a farm is reached. This causes a plateau or decrease in farm throughput.
The following table lists some common bottlenecks and describes their causes and possible resolutions.
Troubleshooting performance and scalability
Bottleneck | Cause | Resolution
Application server CPU utilization | When an application server is overloaded with requests, average CPU utilization will approach 100 percent. This prevents the application servers from responding to requests quickly and can cause timeouts and error messages on client computers. | Application servers can have up to 8 cores available to them, beyond which additional application servers should be made available. Documents should rarely, if ever, enter the queue during peak usage, and as a best practice peak CPU usage should be kept below 70 percent.
Web server CPU utilization | When a Web server is overloaded with user requests, average CPU utilization will approach 100 percent. This prevents the Web server from responding to requests quickly and can cause timeouts and error messages on client computers. | This issue can be resolved in one of two ways. You can add Web servers to the farm to distribute user load, or you can scale up the Web server or servers by adding higher-speed processors.
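The 70 percent best practice from the table above can be checked against a series of CPU utilization samples. The sketch below is illustrative: the function name `cpu_peak_report` and the sample values are hypothetical, and how the samples are collected (for example, from a performance counter log) is outside its scope.

```python
from typing import Sequence, Tuple

PEAK_CPU_TARGET_PERCENT = 70.0  # best-practice ceiling for peak CPU usage


def cpu_peak_report(samples_percent: Sequence[float]) -> Tuple[float, bool]:
    """Return (peak utilization, whether the peak stays below the target)."""
    if not samples_percent:
        raise ValueError("at least one CPU sample is required")
    peak = max(samples_percent)
    return peak, peak < PEAK_CPU_TARGET_PERCENT


if __name__ == "__main__":
    # Hypothetical samples taken during peak usage.
    peak, within_target = cpu_peak_report([35.0, 52.5, 81.0, 64.0])
    print(f"peak={peak}% within_target={within_target}")
```

When the peak is not within the target, the resolutions in the table apply: add servers, or scale up while staying within the 8-core limit for application servers.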
Performance monitoring
To help you determine when you have to scale up or scale out your system, use performance counters to monitor the health of your system. Use the information in the following tables to determine which performance counters to monitor, and to which process the performance counters should be applied.
Web servers
The following table shows performance counters and processes to monitor for Web servers in your farm.
Performance counter | Apply to object | Notes
Processor time | Total | Shows the percentage of elapsed time that this thread used the processor to execute instructions.
Memory utilization | Application pool | Shows the average utilization of system memory for the application pool. You must identify the correct application pool to monitor. The basic guideline is to identify peak memory utilization for a given Web application, and assign that number plus 10 to the associated application pool.
Conversion Queued Requests | Application servers | Shows the number of rendering requests that have been queued until the application server is able to service the request. This should be steady at 0. Consistently having requests in the queue indicates that the application servers are not able to keep up with the request load, which means delayed responses for users. To address this, add more application servers, increase the maximum number of Web App viewing worker processes, or both. Increasing the number of worker processes allows the farm to service more requests per application server, but may result in increased CPU usage on the application server. You can increase the maximum number of Web App viewing worker processes to equal the number of CPU cores available on the application server, to a maximum of 8 (additional cores may not result in additional throughput gains).
Conversion Request Frontend Cache Misses | Front-end Web servers | Shows the number of viewing requests that are not serviced by the web front end’s cache. This should be relatively low. Consistently having cache misses causes more requests on the SQL Server store. This might be remedied by implementing front-end affinity or increasing the size of the front-end cache. This cache is 75 MB by default, while a cache size of 2 GB is recommended if the front-end Web servers have spare memory.
Conversion Request Average Download Time & Edit Average Download Time | Application servers & SQL Server store | Shows the length of time to download documents to the application server from the SQL Server store before they are rendered by the Web App. This should be relatively low. Long download times will block the user from viewing the document they are trying to access. This might be caused by availability problems on the application server or the SQL Server store.
OneNote Editor Page Load & Word Editor Page Load | Front-end Web servers | For front-end Web servers that service Word and OneNote editing requests, this shows the amount of time necessary to load the page. Large spikes indicate that there are not enough front-end Web servers to handle the load.
Broadcast GetData Rate | Front-end Web servers | Shows a rough indication of the number of users viewing PowerPoint Broadcasts. A high rate of BroadcastGetData requests indicates that a large number of users are viewing PowerPoint Broadcasts. This means PowerPoint Broadcast viewers may be using more front-end CPU time than desired. This may have an effect on overall server performance if total CPU usage is high.
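Two of the rules in the Conversion Queued Requests notes above lend themselves to a short calculation: the worker process ceiling (number of cores, capped at 8) and how often the queue is non-empty. The following is a minimal sketch with hypothetical function names and sample values; it does not read the counter itself, and what fraction counts as "consistently" queued is left to the reader.

```python
from typing import Sequence

MAX_VIEWING_WORKER_PROCESSES = 8  # more cores may not add throughput


def max_viewing_worker_processes(cpu_cores: int) -> int:
    """Upper limit for Web App viewing worker processes on one application server."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return min(cpu_cores, MAX_VIEWING_WORKER_PROCESSES)


def queued_fraction(queued_request_samples: Sequence[int]) -> float:
    """Fraction of samples in which at least one rendering request was queued.

    The guidance is that the counter should be steady at 0, so a healthy
    farm yields a value at or near 0.0.
    """
    if not queued_request_samples:
        return 0.0
    busy = sum(1 for queued in queued_request_samples if queued > 0)
    return busy / len(queued_request_samples)


if __name__ == "__main__":
    print(max_viewing_worker_processes(16))
    # Hypothetical counter samples collected during peak usage.
    print(queued_fraction([0, 0, 3, 1]))
```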