Events are generated by SQL Server and entered into the Microsoft Windows application log. SQL Server Agent reads the application log and compares events written there to alerts that you have defined. When SQL Server Agent finds a match, it fires an alert, which is an automated response to an event. In addition to monitoring SQL Server events, SQL Server Agent can also monitor performance conditions and Windows Management Instrumentation (WMI) events.
To define an alert, you specify:
- The name of the alert.
- The event or performance condition that triggers the alert.
- The action that SQL Server Agent takes in response to the event or performance condition.
Specifying a SQL Server Event
You can specify an alert to occur in response to one or more events. Use the following parameters to specify the events that trigger an alert:
- Error numberSQL Server Agent fires an alert when a specific error occurs. For example, you might specify error number 2571 to respond to unauthorized attempts to invoke Database Console Commands (DBCC).
- Severity levelSQL Server Agent fires an alert when any error of the specific severity occurs. For example, you might specify a severity level of 15 to respond to syntax errors in Transact-SQL statements.
- DatabaseSQL Server Agent fires an alert only when the event occurs in a particular database. This option applies in addition to the error number or severity level. For example, if an instance contains one database that is used for production and one database that is used for reporting, you can define an alert that responds to syntax errors in the production database only.
- Event textSQL Server Agent fires an alert when the specified event contains a particular text string in the event message. For example, you might define an alert that responds to messages that contain the name of a particular table or a particular constraint.
Important Severity Alerts
When implementing this, it is a common practice for DBAs to enable alerts for Severity 17 or higher on their SQL Servers. This is because these are not correctable by end users. Again, I believe everyone should have a monitoring solution in place even if it is just monitoring resources on the machine, for this reason, I only enable them for 18 and above. If you don’t have a monitoring solution though, please enable alerts for 17 also. Error 17 indicates that a statement caused SQL Server to run out of resources.
|Error Severity||What it indicates|
|18||There is a problem with the database engine software.|
|19||Nonconfigurable Database engine limits were exceeded, and the batch was terminated.|
|20||A statement has encountered a problem with the current task, unlikely to cause damage to the database itself.|
|21||A problem was encountered that affects all tasks in the database, unlikely to cause damage to the database itself.|
|22||The table or index specified in the message has been damaged by a software or hardware problem.|
|23||The integrity of the entire database is in question because of a hardware or software problem.|
|24||Media failure. Most likely means a restore of the database and a call to your hardware vendor.|
|25||Unexpected errors, this is the catch all for Microsoft SQL Server.|
Next, it is important to also mention that you should set up alerts for the following error messages 823, 824 and 825. These are signs that your underlying storage system having issues and should be investigated by your system administrator and hardware vendor. Additionally, if you receive these messages as a DBA you should check the suspect pages table in SQL server and run a CHECKDB. This will confirm your state of your database. To query your suspect pages table, use the query below, more details about the event types can be found here.
SELECT db_name(database_id) as database_name, file_id,page_id, error_count, last_update_date, CASE WHEN event_type = '1' THEN 'Error 823 or 824' WHEN event_type = '2' THEN 'Bad checksum on page' WHEN event_type = '3' THEN 'Torn page' ELSE 'restored after was damaged or repaired/deallocated by DBCC' END as Event_information FROM msdb..suspect_pages WHERE (event_type = 1 OR event_type = 2 OR event_type = 3);
To summarize at a high level, alerts should be created for:
- Events with Severity >= 18 if you have a monitoring solution, but if you don’t, enable 17 and above alerts
- These are high-severity errors that should be investigated by the system administrator/DBA
- Error 823,824,825 read-retry errors
- These errors spell doom for your disk subsystem
|Error Number||1205||Deadlock Detected|
|823||An 823 error that was caused by a cyclic redundancy check (CRC) issued by the operating system, such as a disk error (certain hardware errors)|
|824||An 824 error, such as a torn page (any logical error)|
|Severity||016||Miscellaneous User Error|
|018||Nonfatal Internal Error|
|019||Fatal Error in Resource|
|020||Fatal Error in Current Process|
|021||Fatal Error in Database Processes|
|022||Fatal Error: Table Integrity Suspect|
|023||Fatal Error: Database Integrity Suscpect|
|024||Fatal Error: Hardware Error|
RAISERROR('This is a test Severity 16 alert, please ignore.',16,1) WITH LOG;
DESCRIPTION: Error: 50000 Severity: 16 State: 1 This is a test Severity 16 alert, please ignore.
Error:  Database Mail is not enabled for agent notifications. Cannot send e-mail to xxxxxxxx.
Fix: Enable mail in SQL Server Agent (Default is disabled)
SQL Server Agent -> Property -> Alert System -> Enable mail profile
Made sure to setup delay response or you can get a lot of alerts for a single failure.