• subscribe
August 17, 2009 12:00 AM

TraceErrors Process

This process watches for events and sends you a daily email so you can stay on top of your servers.
SQL Server Pro
InstantDoc ID #102445
Downloads
102445_2.zip
102445sqlsig.zip

TraceErrorsReport
In the TraceErrorsReport stored procedure I use sqlsig, a CLR user-defined function (UDF) created by Itzik Ben-Gan. With help from Ray Maor of DBSophic, I modified sqlsig to remove some unwanted characters, such as CR and LF (ASCII 13 and 10), that caused problems rendering the tabular result set into the HTML report used by the calling application.

The UDF works on the TextData column of the rows generated by the RPC:Starting and SQL:BatchStarting events. It replaces specific characters, such as parameters, with a hash sign, which lets it aggregate the data, preventing the same error from showing up too many times in the report.

The @Rows input parameter passed to the TOP clause of the SELECT statement allows you to control the number of rows returned. Note that the returned result set isn't sorted in any order. My first choice would be to sort the results by the StartTime column, but to do so I would have to include the StartTime column in the SELECT list. Including StartTime would make the DISTINCT operator totally worthless, because a SELECT statement that uses the DISTINCT operator requires the columns in the ORDER BY clause to be part of the SELECT list.

I didn't include important columns such as StartTime, EventSequence, and SPID for the sake of aggregation. However, this data is captured and available at in the TraceErrorsFull table, where it remains until the next time the procedure runs, so the data is available to you while you're investigating a specific case.

The MIN_StartTime and MAX_StartTime columns come from TraceErrorsFull, which holds the raw trace data. I included them in the report only so that you know the timeframe of events that the report worked on. You can omit these columns or modify the procedure to return an additional result set with just these two columns. The Hours column shows the difference in hours between these two columns.

TraceErrorsStatus
The TraceErrorsStatus stored procedure is aimed at making sure the trace is running and active. The procedure is invoked by a job that has two schedules. The first schedule executes the stored procedure when the SQL Server Agent service is started, which results in the trace being started. The second schedule runs every hour to make sure that the trace exists and is in an active state. Alternately, the procedure can be invoked by a startup stored procedure instead of the first schedule. This eliminates the SQL Server Agent service as a point of failure and starts the trace a little earlier.

Once your trace is created, you can view its state and additional information by querying the system views. Use the query in Listing 1 as a starting point.

Listing 1: Viewing Trace Metadata
-- Get all traces
SELECT
	 [id], , event_count, dropped_event_count, [path]
	,max_size, max_files, is_rollover, is_shutdown, buffer_count
	,buffer_size, file_position, start_time, last_event_time
FROM sys.traces;


-- Get Events
SELECT
	 e.name AS [event]
	,c.name AS [column]
FROM fn_trace_geteventinfo(2) ei --< edit TraceId
INNER JOIN sys.trace_events e ON ei.eventid = e.trace_event_id
INNER JOIN sys.trace_columns c ON ei.columnid = c.trace_column_id;


-- Get Filters
SELECT
	 columnid
	,c.name AS [column]
	,logical_operator
	,comparison_operator
	,\[value]
FROM fn_trace_getfilterinfo(2) ei  --< edit TraceId
INNER JOIN sys.trace_columns c ON ei.columnid = c.trace_column_id;


-- Get SubClass values
SELECT
	 c.name AS [column]
	,e.name AS [event]
	,s.subclass_value
	,s.subclass_name
FROM sys.trace_columns c
INNER JOIN sys.trace_subclass_values s ON c.trace_column_id = s.trace_column_id
INNER JOIN sys.trace_events e ON e.trace_event_id = s.trace_event_id
WHERE e.name LIKE 'Audit Login';


-- Get Events / Columns combination
SELECT
	 e.name AS [event]
	,c.name AS [column]
FROM sys.trace_event_bindings b
INNER JOIN sys.trace_events e ON e.trace_event_id = b.trace_event_id
INNER JOIN sys.trace_columns c ON c.trace_column_id = b.trace_column_id
WHERE e.name LIKE 'Audit Login'
ORDER BY e.name;


Why Capture?
Capturing SQL exceptions at the very lowest level, the database engine, is an important aspect of monitoring production systems. Capturing the exceptions provides you with your own set of logs so that you don't need to rely only on logs generated by applications, which can sometimes be unreliable or, in some environments, inaccessible to DBAs. Capturing RPC:Starting and SQL:BatchStarting events is truly a necessity on production systems, where a DBA needs to find out quickly what happened when things go wrong. Knowing what happened is an important step in preventing the same problem from being repeated.

Related Reading:

Related Microsoft Articles:


ARTICLE TOOLS

Comments
  • Yaniv
    3 years ago
    Oct 05, 2009

    Hi, I understand your concern that the process may affect your server performance however this is not the case.
    If you would run such a trace using Profiler then you get in to trouble but not if you use the scripts provided in this article. I take it a step forward and make a statement that if you implement this process not only you are not risking your server's performance but you will not be able to measure the impact on your system, that is the foot print can hardly be measured because it is that small.

    The explanation to the different behaviors Profiler vs. Server Side trace (implemented at this process) is that SQL Trace is a database engine technology that exposes 2 providers: the rowset provider and the file provider. The rowset provider used by Profiler is much "heavier" than the file provider. The file provider has also many other advantages over the rowset provider.
    Profiler, however is always great for quick and dirty work but has to be used with caution.

    For additional information about SQL Trace you may download a presentation I have that is just about that: http://sqlserverutilities.com/download/Tracing_SQL_Server.zip

    -Yaniv

  • Vincent
    3 years ago
    Sep 22, 2009

    Looks useful, but doesn't running another trace all the time create significant load on the server? 99% of the problems I have to find and fix are not error related, but are performance related (whether it be bad code or bad DB design). So, implementing something that could further negatively impact performance doesn't seem too wise.

  • SHAHAR
    3 years ago
    Sep 06, 2009

    Great article, very practical. Already implemented at my company.
    One correction: In procedure TraceErrorsCreate I had to modify the path to the trace files location that match my server.

You must log on before posting a comment.

Are you a new visitor? Register Here