Friday, October 26, 2012

SQL Sentry v7.2 does Windows (and a whole lot more)

Since releasing SQL Sentry v7 with Fragmentation Manager in May, we've been hard at work on v7.2. Development ran concurrently with Plan Explorer PRO, which was released last week. v7.2 includes two exciting new products: Performance Advisor for Windows and Event Manager for Windows. The release candidate was recently published, and you can download it from our customer portal. (Existing customers, please contact us for eval licenses.)

The new features fall into three main categories: Query Plan Analysis, Windows Monitoring, and History/Alert Filtering.

Query Plan Analysis

As you may have guessed, the new version of Performance Advisor for SQL Server includes all Plan Explorer PRO features! Read about those in my last post.

Windows Monitoring

One of the most common requests we've received over the years has been for the ability to monitor any Windows computer, such as one running SharePoint, IIS, SSRS, or SSIS services, but no SQL Server or SSAS services. Performance Advisor for SQL Server has actually always had Windows monitoring built in – it's effectively the left-hand side of the dashboard – the problem was, you only got it if you also had SQL Server or SSAS installed on the machine.

I'm happy to say that not only has this restriction been removed in the new Performance Advisor for Windows, but when combined with the new Event Manager for Windows you now have truly unprecedented capabilities for monitoring Windows performance. Keep reading for a rundown.

Service & Process-level Metrics

To date, probably the most common way to find out what's happening with Windows processes has been via Windows Task Manager. Process Explorer is a more robust tool that is also used; however, both tools suffer from some critical shortcomings:
  1. It's not easy to tell which Windows services are associated with which processes. Task Manager now has "Go to Service|Process" context menus, but in practical use they are not very helpful. Process Explorer requires a few clicks to view associated services. Most importantly, both tools are "one-process-at-a-time".
  2. Lack of historical data. Process Explorer has some limited charting, but neither tool can show which process(es) brought a server to its knees 3 hours ago, let alone let you view the data in context with what was happening with SQL Server or other key metrics at the time.
  3. No way to easily get performance metrics for individual Windows services, or combined metrics for multiple related processes/services. SharePoint and IIS in particular utilize several processes and services, and the problem may actually be the cumulative impact.

Performance Advisor for Windows addresses all of these issues by:

  1. Auto-correlating processes with services.
  2. Providing historical performance data for processes and services.
  3. Organizing related processes into friendly groups, with individual and group level metrics.
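
To make the group-level idea concrete, here is a minimal sketch of rolling per-process CPU samples up into named groups. It is not how Performance Advisor is implemented; the prefix rules, the `sp_` process names, and the sample values are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Illustrative group rules: group name -> process-name prefixes.
# These are placeholders, not SQL Sentry's actual group configuration.
GROUP_RULES = {
    "SharePoint": ("sp_",),
    "IIS": ("w3wp",),
}


def group_for(process_name):
    """Return the first group whose prefix matches, else 'Other'."""
    name = process_name.lower()
    for group, prefixes in GROUP_RULES.items():
        if name.startswith(prefixes):
            return group
    return "Other"


def summarize(samples):
    """Sum CPU percent per group and keep the per-process detail."""
    totals = defaultdict(float)
    members = defaultdict(list)
    for process_name, cpu_percent in samples:
        group = group_for(process_name)
        totals[group] += cpu_percent
        members[group].append((process_name, cpu_percent))
    return dict(totals), dict(members)


# Hypothetical samples: (process name, CPU percent)
samples = [
    ("sp_search_1.exe", 30.0),
    ("sp_search_2.exe", 25.0),
    ("w3wp.exe", 10.0),
    ("notepad.exe", 1.0),
]
totals, members = summarize(samples)
print(totals)
print(members["SharePoint"])
```

Keeping both the group total and the member list is what lets a single series stand for the whole group while still showing which individual processes were behind a spike.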

In the dashboard shot below, the tannish series represents all SharePoint processes and services, and by hovering over it I can instantly see that two SharePoint search processes together were mostly responsible for the CPU spike:

cpu_processes_sample

If I want full details, I can simply select the range and then Jump To > Processes:

cpu_jumptoprocesses

...which takes me directly to the new Processes tab, where I have a full list of all active processes and metrics for the range, organized by service and group:

processes_tab

It's just as easy to find the largest memory consumers – here there are several IIS worker processes consuming over 1GB of RAM, and causing memory pressure for a SQL Server instance on the same machine (not a recommended configuration!):

memory_processes

Several groups come preconfigured, including SharePoint, IIS, SSIS, SSRS, and SQL Sentry, and we'll be expanding this list as time goes on. You can also easily add new groups yourself – more on that in a future post.

If, like me, you've spent way too much time using the "old school" ways of troubleshooting process/service-related performance issues, these new features should be a godsend.

Processor Groups + NUMA Support

With Windows Server 2008 R2, Microsoft introduced processor groups as part of support for more than 64 processors on a single computer. Lately we've been seeing more and more of these systems come online for both multi-instance SQL Server environments and VMware or Hyper-V hosts running tens or hundreds of VMs.
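
As a rough illustration of the arithmetic, a processor group holds at most 64 logical processors, so the minimum number of groups follows from a ceiling division. This is only a lower bound: how Windows actually assigns processors to groups also depends on the NUMA topology. The function name is my own.

```python
import math

# A Windows processor group contains at most 64 logical processors.
MAX_LOGICAL_PROCESSORS_PER_GROUP = 64


def min_processor_groups(logical_processors):
    """Return the smallest number of processor groups that can hold them all."""
    if logical_processors < 1:
        raise ValueError("logical_processors must be at least 1")
    return math.ceil(logical_processors / MAX_LOGICAL_PROCESSORS_PER_GROUP)


for count in (64, 65, 80, 256):
    print(count, min_processor_groups(count))
```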

Previously the PA dashboard maxed out at 40 processors due to limitations in how perflib reported instance level data. Now the dashboard will show utilization across all processors on a system. Here's a shot of an 80-processor system (4 x 10 cores, hyperthreaded):

cpu_numa_pgs

If you look closely, you'll see that the NUMA node associated with each processor is indicated using the format <NUMA Node ID>:<Processor ID>. This system has 4 NUMA nodes, and here only nodes 1 and 3 are active. This information can be invaluable for evaluating how your NUMA resources are being utilized, and for judging the effectiveness of configured affinity settings.
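
If you wanted to reason about labels in that format outside the dashboard, a small sketch like the following would do. It assumes you have the labels and utilization values as plain data; the sample numbers, the function names, and the 5% "active" threshold are all made up for illustration.

```python
from collections import defaultdict


def parse_label(label):
    """Split a '<NUMA Node ID>:<Processor ID>' label into two integers."""
    node_id, processor_id = label.split(":")
    return int(node_id), int(processor_id)


def active_nodes(utilization, threshold=5.0):
    """Return sorted NUMA node IDs whose average utilization exceeds threshold.

    `utilization` maps a processor label to a CPU percentage. The threshold
    is an arbitrary cutoff chosen for this example.
    """
    per_node = defaultdict(list)
    for label, percent in utilization.items():
        node_id, _ = parse_label(label)
        per_node[node_id].append(percent)
    return sorted(
        node_id
        for node_id, values in per_node.items()
        if sum(values) / len(values) > threshold
    )


# Hypothetical readings for a handful of processors on a 4-node system.
sample = {
    "0:0": 1.0,
    "0:1": 2.0,
    "1:20": 80.0,
    "1:21": 70.0,
    "2:40": 0.5,
    "3:60": 65.0,
}
print(active_nodes(sample))
```

Averaging per node rather than looking at single processors makes it easier to spot affinity settings that leave whole nodes idle.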

Windows Event Log Monitoring

We've rolled the former Event Manager for Task Scheduler into the new Event Manager for Windows product, and added Windows event log monitoring, another hugely popular request!

eventlogs

You can selectively enable/disable monitoring for the Application, Security and System logs individually. Once enabled, events will show up on the calendar and become available for alerting via the new Windows Event Log: Event condition. By default, only the Application and System logs are watched by SQL Sentry, and only events from SQL Server or SQL Sentry services are collected using the new history filters.

History and Alert Filtering

Which leads me to our next topic: filtering. There are two types of filters – history (or source) filters and alert (or condition) filters. They're both configured the exact same way.

History Filtering

Many event sources now have a configurable filter which allows you to tell SQL Sentry exactly which events you want to collect and store from that source, giving you much greater control over associated storage requirements, which events will show up on the calendar, and which are available for alerting. You'll find this under Settings > History Filter for each source. This shot shows the new Windows Event Logs Source filter:

source_filter

This is the default filter (as mentioned above) which collects only events for the SQL Sentry and SQL Server applications, and only non-backup events (since you'll often already have the associated job and Top SQL events on the calendar). We've made this pretty restrictive to start since the Windows logs can get flooded with events from various sources, and we didn't want to overload you with stuff you may not want to see. That said, you are free to modify these filters to your heart's content ;-)

Alert Filtering

You can think of alert filtering as a layer on top of history filtering. For example, using the history filter you may want to collect all SQL Server related events from the Windows logs and see them on the calendar, but you only want to be alerted for errors. This is trivial to configure by selecting the Condition Settings tab for the Windows Event Log: Event condition:

condition_filter

A really cool aspect of filter configuration is that you pick from a list of filterable fields applicable to that particular source, and if the possible values are known, you pick from those too. Here are some other common use cases:

  • Only send an email for deadlocks NOT caused by a particular application (ahem, SharePoint ;-)
  • Only send an SNMP trap to SCOM for SQL Agent job failures for jobs in a certain category
  • Only alert for Top SQL failures coming from a specific host machine

These examples are only scratching the surface of what's possible with the new filtering. I'm always interested in hearing how new features like this are put to use, so please let me know about your favorite filters.
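
Conceptually, the two filter types behave like two predicates applied in sequence: the history filter decides what is collected, and the alert filter decides which collected events trigger an action. The sketch below shows only that layering. The field names, source names, and sample events are assumptions for illustration and do not reflect SQL Sentry's actual schema or configuration.

```python
def history_filter(event):
    """Collect only SQL Server related events (illustrative rule)."""
    return event["source"].startswith("MSSQL")


def alert_filter(event):
    """Of the collected events, alert only on errors (illustrative rule)."""
    return event["level"] == "Error"


# Hypothetical Windows event log entries.
events = [
    {"source": "MSSQLSERVER", "level": "Information", "message": "Startup"},
    {"source": "MSSQLSERVER", "level": "Error", "message": "Login failed"},
    {"source": "SomeOtherApp", "level": "Error", "message": "Crash"},
]

# History filter first: these events are stored and shown on the calendar.
collected = [e for e in events if history_filter(e)]

# Alert filter second: it can only see what the history filter let through.
alerts = [e for e in collected if alert_filter(e)]

print(len(collected), "collected;", len(alerts), "alerted")
```

Note that the error from `SomeOtherApp` never raises an alert, because it was never collected in the first place.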

More than Windows?

Although I've been using "Windows monitoring" as a general term, these new features are especially applicable to some specific scenarios such as managing a SharePoint Server farm where you have different services split across many servers. You'll still have one or more SQL Servers playing a large role in farm performance, of course, but your visibility won't be limited to those servers... for the first time you can have a complete picture of performance across the entire farm!

We've published pricing for Performance Advisor for Windows, Event Manager for Windows, and the combined Power Suite for Windows here. The full product pages should be live by early next week.

We'll be showing off v7.2 in person over the next two weeks at DevConnections and then the PASS Summit, so please stop by our booth for a full tour!
