Saturday, May 8, 2010

Performance Monitor – Base Objects to Monitor

I am always surprised at how many senior engineers do not know how to use the Performance Monitor built into Windows Server to find hardware, software, and network bottlenecks, or who have never used it at all. The first thing to remember is that running Performance Monitor costs the server some performance too. You obviously need to run it to troubleshoot, but keep each capture short, under 30 minutes if possible, and capture first while the server is not in use, then under average use, then at peak times or under forced load.

Below are the baseline object counters that I always start with. I then eliminate Performance Objects one by one, based on their results, until I have narrowed the problem down to two or three. I suggest you create a baseline like this and save it locally, so that the next time you need a capture you can simply open it and start it. I have had this baseline for literally years; I probably got it from a Microsoft white paper.

Start - Run - type Perfmon, or

Start - Administrative Tools - Performance - System Monitor. You have three display formats: graph, histogram, and text report.

Adding Performance Counters

  • Right-click the counter pane and select Add Counters
  • Choose the computer to monitor: select Use local computer counters for the local computer, or select Select counters from computer and enter the UNC path of a remote computer
  • Select an object from the Performance object drop-down list
  • Select the counter; the available counters depend on the selected performance object
  • Click Add to add it to System Monitor
  • Click the Explain button if you need more details about the counter
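
If you would rather script the saved baseline than click through the dialog each time, a short capture can also be started from the command line with typeperf, which ships with Windows. The sketch below only builds the argument list. The instance names such as (_Total) and (*), the 15-second interval, and the output file name are my assumptions, so adjust them for your server.

```python
# Counter paths taken from the baseline in this post. Instance names such as
# "(_Total)" and "(*)" are assumptions and may differ on your server.
BASELINE_COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Network Interface(*)\Bytes Total/sec",
    r"\LogicalDisk(*)\% Idle Time",
    r"\Paging File(_Total)\% Usage",
    r"\Memory\Available MBytes",
    r"\Memory\Page Faults/sec",
    r"\System\Processor Queue Length",
]


def build_typeperf_command(counters, interval_s=15, minutes=30,
                           output="baseline.csv"):
    """Return the typeperf argument list for one short CSV capture."""
    samples = (minutes * 60) // interval_s
    return [
        "typeperf", *counters,
        "-si", str(interval_s),  # seconds between samples
        "-sc", str(samples),     # stop after this many samples
        "-f", "CSV",             # log file format
        "-o", output,            # log file path (hypothetical name)
        "-y",                    # answer yes to prompts, e.g. overwrite
    ]


command = build_typeperf_command(BASELINE_COUNTERS)
print(command)
```

On the Windows server itself, the list can be passed to subprocess.run(command, check=True). Keeping the sample count tied to the capture length makes it hard to leave a capture running longer than intended.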

| Performance Object | Counter | Threshold | Description |
|---|---|---|---|
| Processor | % Processor Time (_Total) | 80 to 85 percent, averaged over three intervals | The total percentage of processor usage for a server. |
| Network Interface | Bytes Total/sec (per network interface) | 50 percent of the available network interface bandwidth; for example, a 100-Mbps network interface sustaining about 6,250 KB per second | The rate at which bytes are sent and received over each network adapter. |
| Logical Disk | % Idle Time (drives C:, D:, and so on) | Below 20 percent idle time | Reports the percentage of time during the sample interval that the disk was idle. If this value is very low, the logical disk is very busy. |
| Paging File | % Usage | Above 70 percent | Review this value in conjunction with Memory\Available MBytes and Memory\Page Faults/sec to understand paging activity on the server. |
| Memory | Available MBytes | Below 128 MB, assuming 2 GB of RAM as prescribed on servers | The amount of physical memory, in MB, immediately available for allocation to a process or for system use on the server. |
| Memory | Page Faults/sec | 20 | A high rate of page faults indicates a lack of physical memory. |
| System | Processor Queue Length | The number of CPUs + 1 | Exceeding the threshold indicates that the processors are not fast enough to keep up with the queued work. |
| ASP.NET Applications | Requests/Sec (_Total) | Through ongoing monitoring, trends begin to emerge that equate requests per second with CPU consumption | The number of requests executed per second; this roughly equates to the number of HTTP pages per second. |
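
Several of these thresholds involve a little arithmetic: an average over three intervals, a percentage of link bandwidth, and a queue length relative to CPU count. Here is a minimal sketch of those checks. The function names are hypothetical, and I am assuming the link speed is given in megabits per second while Bytes Total/sec is in bytes.

```python
def rolling_average_exceeds(samples, threshold, window=3):
    """True if any `window` consecutive samples average above `threshold`."""
    for i in range(len(samples) - window + 1):
        if sum(samples[i:i + window]) / window > threshold:
            return True
    return False


def network_utilization(bytes_total_per_sec, link_mbps):
    """Fraction of link capacity in use (link speed in megabits per second)."""
    capacity_bytes_per_sec = link_mbps * 1_000_000 / 8
    return bytes_total_per_sec / capacity_bytes_per_sec


def queue_too_long(queue_length, cpu_count):
    """Apply the 'number of CPUs + 1' rule for Processor Queue Length."""
    return queue_length > cpu_count + 1


# % Processor Time samples: the middle three average 88, above 85.
print(rolling_average_exceeds([70, 90, 88, 86, 60], 85))
# 6,250,000 bytes/sec on a 100-Mbps link is half of its capacity.
print(network_utilization(6_250_000, 100))
# A queue of 6 on a 4-CPU server is above the limit of 5.
print(queue_too_long(6, 4))
```

Averaging over a window rather than reacting to a single sample keeps one short spike from being read as a bottleneck.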

| Performance Object | Counter | Threshold | Description |
|---|---|---|---|
| Disk | Physical Disk\% Free Space, Logical Disk\% Free Space | 15% | |
| Disk | Physical Disk\% Disk Time, Logical Disk\% Disk Time | 90% | |
| Disk | Physical Disk\Disk Reads/sec, Physical Disk\Disk Writes/sec | Depends on manufacturer's specifications | Check the specified transfer rate for your disks to verify that this rate does not exceed the specifications. In general, Ultra Wide SCSI disks can handle 50 to 70 I/O operations per second. Whether the I/O is sequential or random can strongly affect the values for Disk Reads/sec and Disk Writes/sec. |
| Disk | Physical Disk\Current Disk Queue Length | Number of spindles plus 2 | This is an instantaneous counter; observe its value over several intervals. For an average over time, use Physical Disk\Avg. Disk Queue Length. |
| Memory | Memory\Available Bytes | For larger-memory computers, greater than 4 MB | Research memory usage and add memory if needed. |
| Memory | Memory\Pages/sec | n pages/sec per pagefile | Research paging activity. Note the amount of I/O going to the disks with pagefiles. |
| Paging file | Paging File\% Usage | Above 70% | Review this value in conjunction with Available Bytes and Pages/sec to understand paging activity on your computer. |
| Processor | Processor\% Processor Time | 85% | Find the process that is using a high percentage of processor time. Upgrade to a faster processor or install an additional processor. |
| Processor | Processor\Interrupts/sec | Depends on processor; 1,000 interrupts per second is a good starting point | A dramatic increase in this counter value without a corresponding increase in system activity indicates a hardware problem. Identify the network adapter, disk, or other hardware causing the interrupts. |
| Server | Server\Bytes Total/sec | | If the sum of Bytes Total/sec for all servers is roughly equal to the maximum transfer rate of your network, you may need to segment the network. |
| Server | Server\Work Item Shortages | 3 | If the value reaches this threshold, consider adding the DWORD entries InitWorkItems (the number of work items allocated to a processor during startup) or MaxWorkItems (the maximum number of receive buffers that a server can allocate) to the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters. InitWorkItems can range from 1 to 512, while MaxWorkItems can range from 1 to 65,535. Start with any value for InitWorkItems and a value of 4,096 for MaxWorkItems, and keep doubling these values until Server\Work Item Shortages stays below 3. |
| Server | Server\Pool Paged Peak | Amount of physical RAM | This value is an indicator of the maximum paging file size and the amount of physical memory. |
| Server | Server Work Queues\Queue Length | 4 | If the value reaches this threshold, there may be a processor bottleneck. This is an instantaneous counter; observe its value over several intervals. |
| Multiple processors | System\Processor Queue Length | 2 | This is an instantaneous counter; observe its value over several intervals. |
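
Because several of these counters are instantaneous, one reading above the threshold means little; what matters is whether the value stays there over several intervals. Here is a rough sketch of that idea, together with the doubling sequence for MaxWorkItems described in the table. The function names are hypothetical, and treating three readings in a row as "several intervals" is my assumption.

```python
def sustained_breach(samples, threshold, intervals=3):
    """True if samples stay above `threshold` for `intervals` readings in a row."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= intervals:
            return True
    return False


def disk_queue_threshold(spindles):
    """Current Disk Queue Length limit: number of spindles plus 2."""
    return spindles + 2


def max_work_item_steps(start=4096, limit=65535):
    """Values to try for MaxWorkItems: double each time, staying within range."""
    value = start
    while value <= limit:
        yield value
        value *= 2


# Queue length readings for a 2-spindle array (limit 4).
readings = [1, 5, 6, 7, 2]
print(sustained_breach(readings, disk_queue_threshold(2)))
print(list(max_work_item_steps()))
```

Resetting the run counter on every reading at or below the limit means only a continuous stretch above it counts as a breach.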
