Monday, 14 April 2014

SQL Server Counters for Monitoring SQL Server Performance

       Counters allow you a method to measure current performance, as well as performance over time. Identifying the metrics you like to use to measure SQL Server performance and collecting them over time gives you a quick and easy way to identify SQL Server problems, as well as graph your performance trend over time.
Below is my top 10 list of SQL Server counters in no particular order. For each counter I have described what it is, and in some cases I have described the ideal value of these counters. This list should give you a starting point for developing the metrics you want to use to measure database performance in your SQL Server environment.
 

1. SQLServer: Buffer Manager: Buffer cache hit ratio

    The buffer cache hit ratio counter represents how often SQL Server is able to find data pages in its buffer cache when a query needs a data page. The higher this number the better, because it means SQL Server was able to get data for queries out of memory instead of reading from disk. You want this number to be as close to 100 as possible. Having this counter at 100 means that 100% of the time SQL Server has found the needed data pages in memory. A low buffer cache hit ratio could indicate a memory problem.

2. SQLServer: Buffer Manager: Page life expectancy

     The page life expectancy counter measures how long pages stay in the buffer cache in seconds. The longer a page stays in memory, the more likely SQL Server will not need to read from disk to resolve a query. You should watch this counter over time to determine a baseline for what is normal in your database environment. Some say anything below 300 (or 5 minutes) means you might need additional memory.
 

3. SQLServer: SQL Statistics: Batch Requests/Sec

    Batch Requests/Sec measures the number of batches SQL Server is receiving per second. This counter is a good indicator of how much activity is being processed by your SQL Server box. The higher the number, the more queries are being executed on your box. Like many counters, there is no single number that can be used universally to indicate your machine is too busy. Today’s machines are getting more and more powerful all the time and therefore can process more batch requests per second. You should review this counter over time to determine a baseline number for your environment.

4. SQLServer: SQL Statistics: SQL Compilations/Sec

       The SQL Compilations/Sec measure the number of times SQL Server compiles an execution plan per second. Compiling an execution plan is a resource-intensive operation. Compilations/Sec should be compared with the number of Batch Requests/Sec to get an indication of whether or not complications might be hurting your performance. To do that, divide the number of batch requests by the number of compiles per second to give you a ratio of the number of batches executed per compile. Ideally you want to have one compile per every 10 batch requests.

5. SQLServer: SQL Statistics: SQL Re-Compilations/Sec

      When the execution plan is invalidated due to some significant event, SQL Server will re-compile it. The Re-compilations/Sec counter measures the number of time a re-compile event was triggered per second. Re-compiles, like compiles, are expensive operations so you want to minimize the number of re-compiles. Ideally you want to keep this counter less than 10% of the number of Compilations/Sec.

6. SQLServer: General Statistics: User Connections

     The user connections counter identifies the number of different users that are connected to SQL Server at the time the sample was taken. You need to watch this counter over time to understand your baseline user connection numbers. Once you have some idea of your high and low water marks during normal usage of your system, you can then look for times when this counter exceeds the high and low marks. If the value of this counter goes down and the load on the system is the same, then you might have a bottleneck that is not allowing your server to handle the normal load. Keep in mind though that this counter value might go down just because less people are using your SQL Server instance.

7. SQLServer: Locks: Lock Waits / Sec: _Total

      In order for SQL Server to manage concurrent users on the system, SQL Server needs to lock resources from time to time. The lock waits per second counter tracks the number of times per second that SQL Server is not able to retain a lock right away for a resource. Ideally you don't want any request to wait for a lock. Therefore you want to keep this counter at zero, or close to zero at all times.

8. SQLServer: Access Methods: Page Splits / Sec

      This counter measures the number of times SQL Server had to split a page when updating or inserting data per second. Page splits are expensive, and cause your table to perform more poorly due to fragmentation. Therefore, the fewer page splits you have the better your system will perform. Ideally this counter should be less than 20% of the batch requests per second.

9. SQLServer: General Statistic: Processes Block

       The processes blocked counter identifies the number of blocked processes. When one process is blocking another process, the blocked process cannot move forward with its execution plan until the resource that is causing it to wait is freed up. Ideally you don't want to see any blocked processes. When processes are being blocked you should investigate.

10. SQLServer: Buffer Manager: Checkpoint Pages / Sec

       The checkpoint pages per second counter measures the number of pages written to disk by a checkpoint operation. You should watch this counter over time to establish a baseline for your systems. Once a baseline value has been established you can watch this value to see if it is climbing. If this counter is climbing, it might mean you are running into memory pressures that are causing dirty pages to be flushed to disk more frequently than normal.

No comments:

Post a Comment