閱讀423 返回首頁    go 阿裏雲 go 技術社區[雲棲]


MSSQL · 應用案例 · 構建死鎖自動收集係統

摘要

這篇文章介紹SQL Server的一個典型的應用案例,即如何利用Event Notification與Service Broker技術相結合來實現死鎖信息自動收集係統。通過這個係統,我們可以全麵把控SQL Server數據庫環境中所有實例上發生的死鎖詳細信息,供我們後期分析和解決死鎖場景。

死鎖自動收集係統需求分析

當 SQL Server 中某組資源的兩個或多個線程或進程之間存在循環的依賴關係時,但因互相申請被其他進程所占用,而不會釋放的資源處於的一種永久等待狀態,將會發生死鎖。SQL Server服務自動死鎖檢查進程默認每5分鍾跑一次,當死鎖發生時,會選擇一個代價較小的進程做為死鎖犧牲品,以此來避免死鎖導致更大範圍的影響。被選擇做為死鎖犧牲品的進程會報告如下錯誤:

Msg 1205, Level 13, State 51, Line 8
Transaction (Process ID 54) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

如果進程間發生了死鎖,對於用戶業務係統,乃至整個SQL Server服務健康狀況影響很大,輕者係統反應緩慢,服務假死;重者服務掛起,拒絕請求。那麼,我們有沒有一種方法可以完全自動、無人工幹預的方式異步收集SQL Server係統死鎖信息並遠程保留死鎖相關信息呢?這些信息包括但不僅限於:

  • 死鎖發生在哪些進程之間

  • 各個進程執行的語句塊是什麼?死鎖時,各個進程在執行哪條語句?

  • 死鎖的資源是什麼?死鎖發生在哪個數據庫?哪張表?哪個數據頁?哪個索引上?

  • 死鎖發生的具體時間點,包含語句塊開始時間、語句執行時間等

  • 用戶進程使用的登錄用戶是什麼?客戶端驅動是什麼?
    ……
    如此的無人值守的自動死鎖收集係統,就是我們今天要介紹的應用案例分享:利用SQL Server的Event Notification與Service Broker建立自動死鎖信息收集係統。

Service Broker和Event Notification簡介

在死鎖自動收集係統介紹開始之前,先簡要介紹下SQL Server Service Broker和Event Notification技術。

Service Broker簡介

Service Broker是微軟至SQL Server 2005開始集成到數據庫引擎中的消息通訊組件,為 SQL Server提供隊列和可靠的消息傳遞的能力,可以用來構建基於異步消息通訊為基礎的應用程序。Service Broker既可用於單個 SQL Server 實例的應用程序,也可用於在多個實例間進行消息分發工作的應用程序。Service Broker使用TCP/IP端口在實例間交換消息,所包含的功能有助於防止未經授權的網絡訪問,並可以對通過網絡發送的消息進行加密,以此來保證數據安全性。多實例之間使用Service Broker進行異步消息通訊的結構圖如下所示(圖片來自微軟的官方文檔):

01.png

Event Notification簡介

Event Notification的中文名稱叫事件通知,執行事件通知可對各種Transact-SQL數據定義語言(DDL)語句和SQL跟蹤事件做出響應,采取的響應方式是將這些事件的相關信息發送到 Service Broker 服務。事件通知可以用來執行以下操作:

  • 記錄和檢索發生在數據庫上的更改或活動。

  • 執行操作以異步方式而不是同步方式響應事件。

可以將事件通知用作替代DDL 觸發器和SQL跟蹤的編程方法。事件通知的信息媒介是以xml數據類型的信息傳遞給Service Broker服務,它提供了有關事件的發生時間、受影響的數據庫對象、涉及的 Transact-SQL 批處理語句等詳細信息。對於SQL Server死鎖而言,可以使用Event Notification來跟蹤死鎖事件,來獲取DEADLOCK_GRAPH XML信息,然後通過異步消息組件Service Broker發送到遠端的Deadlock Center上的Service Broker隊列,完成死鎖信息收集到死鎖中央服務。

死鎖收集係統架構圖

在介紹完Service Broker和Event Notification以後,我們來看看死鎖手機係統的整體架構圖。在這個係統中,存在兩種類型角色:我們定義為死鎖客戶端(Deadlock Client)和死鎖中央服務(Deadlock Center)。死鎖客戶端發生死鎖後,首先會將Deadlock Graph XML通過Service Broker發送給死鎖中央服務,死鎖中央服務獲取到Service Broker消息以後,解析這個XML就可以拿到客戶端的死鎖相關信息,最後存放到本地日誌表中,供終端客戶查詢和分析使用。最終的死鎖收集係統架構圖如下所示:
02.png

詳細的死鎖信息收集過程介紹如下:死鎖客戶端通過本地SQL Server的Event Notification捕獲發生在該實例上的Deadlock事件,並在死鎖發生以後將Deadlock Graph XML數據存放到Event Notification綁定的隊列中,然後通過綁定在該隊列上的存儲過程自動觸發將Deadlock Graph XML通過Service Broker異步消息通訊的方式發送到死鎖中央服務。中央服務在接收到Service Broker消息以後,首先放入Deadlock Center Service Broker隊列中,該隊列綁定了消息自動處理存儲過程,用來解析Deadlock Graph XML信息,並將死鎖相關的詳細信息存入到Deadlock Center的Log Table中。最後,終端用戶可以直接對Log Table來查詢和分析所有Deadlock Client上發生的死鎖信息。通過這係列的過程,最終達到了死鎖信息的自動遠程存儲、收集,以提供後期死鎖場景還原和複盤,達到死鎖信息可追溯,及時監控,及時發現的目的。

Service Broker配置

係統架構設計完畢後,接下來是係統的配置和搭建過程,首先看看Service Broker的配置。這個配置還是相對比較繁瑣的,包含了以下步驟:

  • 創建Service Broker數據庫(假設數據庫名為DDLCenter)並開啟Service Broker選項

  • 創建Service Broker隊列的激活存儲過程和相關表對象

  • 創建Master數據庫下的Master Key

  • 創建傳輸層本地和遠程證書

  • 創建基於證書的用戶登錄

  • 創建Service Broker端口並授權用戶連接

  • 創建DDLCenter數據庫下的Master Key

  • 創建會話層本地及遠程證書

  • 創建Service Broker組件所需要的對象,包括:Message Type、Contact、Queue、Service、Remote Service Binding、Route

Deadlock Client Server

以下的配置請在Deadlock Client SQL Server實例上操作。

  • 創建DDLCenter數據庫並開啟Service Broker選項
-- Run script on client server to gather deadlock graph xml
USE master
GO
-- Create Database
IF DB_ID('DDLCenter') IS NULL
	CREATE DATABASE [DDLCenter];
GO
-- Change datbase to simple recovery model
ALTER DATABASE [DDLCenter] SET RECOVERY SIMPLE WITH NO_WAIT
GO
-- Enable Service Broker
ALTER DATABASE [DDLCenter] SET ENABLE_BROKER,TRUSTWORTHY ON
GO
-- Change database Owner to sa
ALTER AUTHORIZATION ON DATABASE::DDLCenter TO [sa]
GO
  • 三個表和兩個存儲過程

表[DDLCollector].[Deadlock_Traced_Records]:從Event Notification隊裏接收的消息會記錄到該表中。
表[DDLCollector].[Send_Records]:Deadlock Client成功發送Service Broker消息記錄
表[DDLCollector].[Error_Records]:記錄發生異常情況時的信息。
存儲過程[DDLCollector].[UP_ProcessDeadlockEventMsg]:Deadlock Client綁定到隊裏的激活存儲過程,一旦隊列中有消息進入,這個存儲過程會被自動調用。
存儲過程[DDLCollector].[UP_SendDeadlockMsg]:Deadlock Client發送異步消息給Deadlock Center,這個存儲過程會被上麵的激活存儲過程調用。

-- Run on Client Instance
USE [DDLCenter]
GO
-- Create Schema
IF NOT EXISTS(
	SELECT TOP 1 *
	FROM sys.schemas
	WHERE name = 'DDLCollector'
)
BEGIN
	EXEC('CREATE SCHEMA DDLCollector');
END
GO

-- Create table to log Traced Deadlock Records
IF OBJECT_ID('DDLCollector.Deadlock_Traced_Records', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Deadlock_Traced_Records]
GO

CREATE TABLE [DDLCollector].[Deadlock_Traced_Records](
	[RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
	[Processed_Msg] [xml] NULL,
	[Processed_Msg_CheckSum] INT,
	[Record_Time] [datetime] NOT NULL 
		CONSTRAINT DF_Deadlock_Traced_Records_Record_Time DEFAULT(GETDATE()),
	CONSTRAINT PK_Deadlock_Traced_Records_RowId PRIMARY KEY
	(RowId ASC)
) ON [PRIMARY]
GO

-- Create table to record deadlock graph xml sent successfully log
IF OBJECT_ID('DDLCollector.Send_Records', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Send_Records]
GO

CREATE TABLE [DDLCollector].[Send_Records](
	[RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
	[Send_Msg] [xml] NULL,
	[Send_Msg_CheckSum] INT,
	[Record_Time] [datetime] NOT NULL 
		CONSTRAINT DF_Send_Records_Record_Time DEFAULT(GETDATE()),
	CONSTRAINT PK_Send_Records_RowId PRIMARY KEY
	(RowId ASC)
) ON [PRIMARY]
GO

-- Create table to record error info when exception occurs
IF OBJECT_ID('DDLCollector.Error_Records', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Error_Records]
GO

CREATE TABLE [DDLCollector].[Error_Records](
	[RowId] [int] IDENTITY(1,1) NOT NULL,
	[Msg_Body] [xml] NULL,
	[Conversation_handle] [uniqueidentifier] NULL,
	[Message_Type] SYSNAME NULL,
	[Service_Name] SYSNAME NULL,
	[Contact_Name] SYSNAME NULL,
	[Record_Time] [datetime] NOT NULL
		CONSTRAINT DF_Error_Records_Record_Time DEFAULT(GETDATE()),
	[Error_Details] [nvarchar](4000) NULL,
	CONSTRAINT PK_Error_Records_RowId PRIMARY KEY
	(RowId ASC)
) ON [PRIMARY]
GO


USE [DDLCenter]
GO

-- Create Store Procedure to Send Deadlock Graph xml to Center Server
IF OBJECT_ID('DDLCollector.UP_SendDeadlockMsg', 'P') IS NOT NULL
	DROP PROC [DDLCollector].[UP_SendDeadlockMsg]
GO

CREATE PROCEDURE [DDLCollector].[UP_SendDeadlockMsg](
	@DeadlockMsg XML
)
AS  
BEGIN      
	SET NOCOUNT ON; 

	DECLARE 
		@handle UNIQUEIDENTIFIER
		,@Proc_Name SYSNAME
		,@Error_Details VARCHAR(2000)
	;

	-- get the store procedure name
	SELECT 
        @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID)) 
        + '.' 
        + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
    FROM sys.procedures
    WHERE OBJECT_ID = @@PROCID
	;
	
	BEGIN TRY
		
		-- Begin Dialog
		BEGIN DIALOG CONVERSATION @handle
		FROM SERVICE [https://soa/deadlock/service/ClientService]
		TO Service 'https://soa/deadlock/service/CenterService'
		ON CONTRACT [https://soa/deadlock/contract/CheckContract]
		;

		-- Send deadlock graph xml as the message to Center Server
		SEND ON CONVERSATION @handle
		MESSAGE TYPE [https://soa/deadlock/MsgType/Request] (@DeadlockMsg);

		-- Log it successfully
		INSERT INTO [DDLCollector].[Send_Records]([Send_Msg], [Send_Msg_CheckSum])   
		VALUES( @DeadlockMsg, CHECKSUM(CAST(@DeadlockMsg as NVARCHAR(MAX))))
	END TRY
	BEGIN CATCH
		
		-- Record the error info when exception occurs
		SET   @Error_Details=
				' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
				' Error Message : ' + ERROR_MESSAGE() +
				' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
				' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
				' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) +
				' Exception Proc: ' + @Proc_Name
		;    
        
		-- record into table
		INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])         
		VALUES(@DeadlockMsg, @handle, 'https://soa/deadlock/MsgType/Request', 'https://soa/deadlock/service/ClientService', 'https://soa/deadlock/contract/CheckContract', @Error_Details); 

	END CATCH
END
GO

-- Create Store Procedure for Queue: when extend event notification queue message
-- this store procedure will be called.
IF OBJECT_ID('DDLCollector.UP_ProcessDeadlockEventMsg', 'P') IS NOT NULL
	DROP PROC [DDLCollector].[UP_ProcessDeadlockEventMsg]
GO

CREATE PROCEDURE [DDLCollector].[UP_ProcessDeadlockEventMsg]
AS
/*

SELECT * FROM [DDLCollector].[Deadlock_Traced_Records]
SELECT * FROM [DDLCollector].[Send_Records]

SELECT * FROM [DDLCollector].[Error_Records]

*/
BEGIN      
	SET NOCOUNT ON;   
	DECLARE 
		@handle UNIQUEIDENTIFIER
		, @Message_Type SYSNAME
		, @Service_Name SYSNAME
		, @Contact_Name SYSNAME
		, @Error_Details VARCHAR(2000)
		, @Message_Body XML
		, @Proc_Name SYSNAME
	;

	-- Store Procedure Name
	SELECT 
        @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID)) 
        + '.' 
        + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
    FROM sys.procedures
    WHERE OBJECT_ID = @@PROCID
	;

	BEGIN TRY
    
	-- Receive message from queue
	WAITFOR(RECEIVE TOP(1)        
			@handle = conversation_handle
			, @Message_Type = message_type_name
			, @Service_Name = service_name
			, @Contact_Name = service_contract_name
			, @Message_Body = message_body        
			FROM dbo.[https://soa/deadlock/queue/ClientQueue]),Timeout 500
	;
	
	-- just return if there is no message needed to process        
	IF(@@Rowcount=0)      
		BEGIN   
			RETURN        
		END
	-- Get data from message queue
	ELSE IF @Message_Type = 'https://schemas.microsoft.com/SQL/Notifications/EventNotification'      
		BEGIN               
			-- Record message log first
			INSERT INTO  [DDLCollector].[Deadlock_Traced_Records](Processed_Msg, [Processed_Msg_CheckSum])         
			VALUES(@Message_Body, CHECKSUM(CAST(@Message_Body as NVARCHAR(MAX))))
		
			-- BE NOTED HERE: PLEASE DO'T END CONVERSATION, OR ELSE EXCEPTION WILL BE THROWN OUTPUT
			/*
			Error: 17001, Severity: 16, State: 1.
			Failure to send an event notification instance of type 'DEADLOCK_GRAPH' on conversation handle '{67419386-7C34-E711-A709-001C42099969}'. Error Code = '8429'.
			Error: 17005, Severity: 16, State: 1.
			Event notification 'DeadLockNotificationEvent' in database 'master' dropped due to send time service broker errors. Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active.  
			*/
			--END CONVERSATION @handle

			--Here call another Store Procedure to send deadlock graph info to center server
			EXEC [DDLCollector].[UP_SendDeadlockMsg] @Message_Body;
		END
	--End Diaglog Message Type, that means we should end this conversation      
	ELSE IF @Message_Type = N'https://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'        
		BEGIN         
			END CONVERSATION @handle;     
		END
	-- Konwn Service Broker Errors by System.   
	ELSE IF @Message_Type = N'https://schemas.microsoft.com/SQL/ServiceBroker/Error'        
		BEGIN         
			END CONVERSATION @handle       
		
			INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])         
			VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Exception Store Procedure: ' + @Proc_Name);               
		END       
	ELSE
		-- unknown Message Types.        
		BEGIN         
			END CONVERSATION @handle

			INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])          
			VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Received unexpected message type when executing Store Procedure: ' + @Proc_Name);

			-- unexpected message type         
			RAISERROR (N' Received unknown message type: %s', 16, 1, @Message_Type) WITH LOG;        
		END      
	END TRY      
	BEGIN CATCH       
	BEGIN        
		SET   @Error_Details=
				' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
				' Error Details : ' + ERROR_MESSAGE() +
				' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
				' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
				' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) + 
				' Exception Proc: ' + @Proc_Name
		;    
        
		INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])         
		VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, @Error_Details); 
	END      
	END CATCH  
END  
GO
  • 創建Master庫下Master Key
USE master
GO
-- If the master key is not available, create it. 
IF NOT EXISTS (SELECT * 
				FROM sys.symmetric_keys
				WHERE name LIKE '%MS_DatabaseMasterKey%') 
BEGIN
	CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'ClientMasterKey*'; 
END 
GO
  • 創建傳輸層本地證書並備份到本地文件係統

這裏請注意證書的開始生效時間要略微早於當前時間,並設置合適的證書過期日期,我這裏是設置的過期日期為9999年12月30號。

USE master
GO
-- Crete Transport Layer Certification
CREATE CERTIFICATE TrpCert_ClientLocal
AUTHORIZATION dbo
WITH SUBJECT = 'TrpCert_ClientLocal',
START_DATE = '05/07/2017',
EXPIRY_DATE = '12/30/9999'
GO

-- then backup it up to local path
-- and after that copy it to Center server
BACKUP CERTIFICATE TrpCert_ClientLocal
TO FILE = 'C:\Temp\TrpCert_ClientLocal.cer';
GO
  • 創建傳輸層遠程證書

這裏的證書是通過證書文件來創建的,這個證書文件來自於遠程通訊的另一端Deadlock Center SQL Server的證書文件的一份拷貝。

USE master
GO
-- Create certification came from Center Server.
CREATE	CERTIFICATE TrpCert_RemoteCenter 
FROM FILE = 'C:\Temp\TrpCert_RemoteCenter.cer'
GO
  • 創建基於證書文件的用戶登錄

這裏也可以創建帶密碼的常規用戶登錄,但是為了規避安全風險,這裏最好創建基於證書文件的用戶登錄。

USE master
GO
-- Create user login
IF NOT EXISTS(SELECT * 
			FROM sys.syslogins 
			WHERE name='SSBDbo')
BEGIN
	CREATE LOGIN SSBDbo FROM CERTIFICATE TrpCert_ClientLocal;
END
GO
  • 創建Service Broker TCP/IP通訊端口並授權用戶連接權限

這裏需要注意的是,端口授權的證書一定本地實例創建的證書,而不是來自於遠程服務器的那個證書。比如代碼中的AUTHENTICATION = CERTIFICATE TrpCert_ClientLocal部分。

USE master
GO 		 
--Creaet Tcp endpoint for SSB comunication and grant connect to users. 	 
CREATE ENDPOINT EP_SSB_ClientLocal
STATE = STARTED 
AS TCP 
( 
 	LISTENER_PORT = 4022 
) 
FOR SERVICE_BROKER (AUTHENTICATION = CERTIFICATE TrpCert_ClientLocal,  ENCRYPTION = REQUIRED 
) 
GO 

-- Grant Connect on Endpoint to User SSBDbo
GRANT CONNECT ON ENDPOINT::EP_SSB_ClientLocal TO SSBDbo 
GO
  • 創建DDLCenter數據庫Master Key
-- Now, let's go inside to conversation database
USE DDLCenter
GO

-- Create Master Key
IF NOT EXISTS (SELECT * 
				FROM sys.symmetric_keys 
				WHERE name LIKE '%MS_DatabaseMasterKey%')
BEGIN		
	CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'DDLCenterMasterKey*';
END
GO
  • 創建會話層本地證書
USE DDLCenter
GO
-- Create conversation layer certification
CREATE CERTIFICATE DlgCert_ClientLocal
AUTHORIZATION dbo
WITH SUBJECT = 'DlgCert_ClientLocal',
START_DATE = '05/07/2017',
EXPIRY_DATE = '12/30/9999'
GO

-- backup it up to local path
-- and then copy it to remote Center server
BACKUP CERTIFICATE DlgCert_ClientLocal
TO FILE = 'C:\Temp\DlgCert_ClientLocal.cer';
GO
  • 創建DDLCenter用戶,不需要和任何用戶登錄匹配
USE DDLCenter
GO
-- Create User for login under conversation database
IF NOT EXISTS(
	SELECT TOP 1 *
	FROM sys.database_principals
	WHERE name = 'SSBDbo'
)
BEGIN
	CREATE USER SSBDbo WITHOUT LOGIN;
END
GO
  • 創建會話層遠程證書,這個證書文件來自Deadlock Center SQL Server備份
USE DDLCenter
GO
-- Create converstaion layer certification came from remote Center server.
CREATE	CERTIFICATE DlgCert_RemoteCenter
AUTHORIZATION SSBDbo
FROM FILE='C:\Temp\DlgCert_RemoteCenter.cer'
GO

GRANT CONNECT TO SSBDbo;
  • 創建Service Broker組件對象

Deadlock Client與Deadlock Center在創建Service Broker組件對象時存在差異:第一個差異是創建Service的時候,需要包含Event Notification的Contract,名稱為
https://schemas.microsoft.com/SQL/Notifications/PostEventNotification;第二個差異是需要多創建一個指向本地服務的路由https://soa/deadlock/route/LocalRoute。

USE DDLCenter
GO

-- Create Message Type
CREATE MESSAGE TYPE [https://soa/deadlock/MsgType/Request]
       VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE [https://soa/deadlock/MsgType/Response]
       VALIDATION = WELL_FORMED_XML;
GO

-- Create Contact
CREATE CONTRACT [https://soa/deadlock/contract/CheckContract](
	[https://soa/deadlock/MsgType/Request] SENT BY INITIATOR,
	[https://soa/deadlock/MsgType/Response] SENT BY TARGET
);
GO

-- Create Queue
CREATE QUEUE dbo.[https://soa/deadlock/queue/ClientQueue] 
WITH STATUS = ON, RETENTION = OFF
, ACTIVATION (STATUS = ON , 
				PROCEDURE_NAME = [DDLCollector].[UP_ProcessDeadlockEventMsg] , 
				MAX_QUEUE_READERS = 2 , 
				EXECUTE AS N'dbo') 
GO

-- Create Service
-- Here is very import, we have to create service for both contacts
-- to get extend event notification and SSB work.
CREATE SERVICE [https://soa/deadlock/service/ClientService]
ON QUEUE [https://soa/deadlock/queue/ClientQueue]
(
  [https://soa/deadlock/contract/CheckContract],
  [https://schemas.microsoft.com/SQL/Notifications/PostEventNotification]
);
GO

-- Grant Send on service
GRANT SEND ON SERVICE::[https://soa/deadlock/service/ClientService] to SSBDbo;
GO

-- Create Remote Service Bingding
CREATE REMOTE SERVICE BINDING [https://soa/deadlock/RSB/CenterRSB]
TO SERVICE 'https://soa/deadlock/service/CenterService' 
WITH  USER = [SSBDbo],
ANONYMOUS=Off
GO

-- Create Route
CREATE ROUTE [https://soa/deadlock/route/CenterRoute]
WITH SERVICE_NAME = 'https://soa/deadlock/service/CenterService',
ADDRESS = 'TCP://10.211.55.3:4024';
GO

-- Create route for the DeadlockNotificationSvc
CREATE ROUTE [https://soa/deadlock/route/LocalRoute]
WITH SERVICE_NAME = 'https://soa/deadlock/service/ClientService',
ADDRESS = 'LOCAL';
GO

Deadlock Center Server

  • 創建DDLCenter數據庫並開啟Service Broker選項
-- Run script on center server to receive client deadlock xml
USE master
GO
-- Create Database
IF DB_ID('DDLCenter') IS NULL
	CREATE DATABASE [DDLCenter];
GO
-- Change datbase to simple recovery model
ALTER DATABASE [DDLCenter] SET RECOVERY SIMPLE WITH NO_WAIT
GO
-- Enable Service Broker
ALTER DATABASE [DDLCenter] SET ENABLE_BROKER,TRUSTWORTHY ON
GO
-- Change database Owner to sa
ALTER AUTHORIZATION ON DATABASE::DDLCenter TO [sa]
GO
  • 三張表和兩個存儲過程

表[DDLCollector].[Collect_Records]:Deadlock Center成功接收到的Service Broker消息。
表[DDLCollector].[Error_Records]:記錄發生異常情況的詳細信息。
表[DDLCollector].[Deadlock_Info]:記錄所有Deadlock Client端發生的Deadlock詳細信息。
存儲過程[DDLCollector].[UP_ProcessDeadlockGraphEventMsg]:Deadlock Center上綁定到隊列的激活存儲過程,一旦隊列中有消息進入,這個存儲過程會被自動調用。
存儲過程[DDLCollector].[UP_ParseDeadlockGraphEventMsg]:Deadlock Center上解析Deadlock Graph XML的存儲過程對象,這個存儲過程會被上麵的激活存儲過程調用來解析XML,然後放入表[DDLCollector].[Deadlock_Info]中。

USE [DDLCenter]
GO

-- Create Schema
IF NOT EXISTS(
	SELECT TOP 1 *
	FROM sys.schemas
	WHERE name = 'DDLCollector'
)
BEGIN
	EXEC('CREATE SCHEMA DDLCollector');
END
GO

-- Create table to log the received message
IF OBJECT_ID('DDLCollector.Collect_Records', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Collect_Records]
GO

CREATE TABLE [DDLCollector].[Collect_Records](
	[RowId] [BIGINT] IDENTITY(1,1) NOT NULL,
	[Deadlock_Graph_Msg] [xml] NULL,
	[Deadlock_Graph_Msg_CheckSum] INT,
	[Record_Time] [datetime] NOT NULL 
		CONSTRAINT DF_Collect_Records_Record_Time DEFAULT(GETDATE()),
	CONSTRAINT PK_Collect_Records_RowId PRIMARY KEY
	(RowId ASC)
) ON [PRIMARY]
GO

-- create table to record the exception when error occurs
IF OBJECT_ID('DDLCollector.Error_Records', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Error_Records]
GO

CREATE TABLE [DDLCollector].[Error_Records](
	[RowId] [int] IDENTITY(1,1) NOT NULL,
	[Msg_Body] [xml] NULL,
	[Conversation_handle] [uniqueidentifier] NULL,
	[Message_Type] SYSNAME NULL,
	[Service_Name] SYSNAME NULL,
	[Contact_Name] SYSNAME NULL,
	[Record_Time] [datetime] NOT NULL
		CONSTRAINT DF_Error_Records_Record_Time DEFAULT(GETDATE()),
	[Error_Details] [nvarchar](4000) NULL,
	CONSTRAINT PK_Error_Records_RowId PRIMARY KEY
	(RowId ASC)
) ON [PRIMARY]
GO

-- create business table to record deadlock analysised info
IF OBJECT_ID('DDLCollector.Deadlock_Info', 'U') IS NOT NULL
	DROP TABLE [DDLCollector].[Deadlock_Info]
GO
CREATE TABLE [DDLCollector].[Deadlock_Info](
	RowId INT IDENTITY(1,1) NOT NULL
	,SQLInstance sysname NULL
	,SPid INT NULL
	,is_Vitim BIT NULL
	,DeadlockGraph XML NULL
	,DeadlockGraphCheckSum INT NULL
	,lasttranstarted DATETIME NULL
	,lastbatchstarted DATETIME NULL
	,lastbatchcompleted DATETIME NULL
	,procname SYSNAME NULL 
	,Code NVARCHAR(max) NULL
	,LockMode sysname NULL
	,Indexname sysname NULL
	,KeylockObject sysname NULL
	,IndexLockMode sysname NULL
	,Inputbuf NVARCHAR(max) NULL
	,LoginName sysname NULL
	,Clientapp sysname NULL
	,Action varchar(1000) NULL
	,status varchar(10) NULL
	,[Record_Time] [datetime] NOT NULL
		CONSTRAINT DF_Deadlock_Info_Record_Time DEFAULT(GETDATE()),
	CONSTRAINT PK_Deadlock_Info_RowId PRIMARY KEY
	(RowId ASC)
)
GO



USE [DDLCenter]
GO

-- Create store procedure to analysis deadlock graph xml
-- and log into business table
IF OBJECT_ID('DDLCollector.UP_ParseDeadlockGraphEventMsg', 'P') IS NOT NULL
	DROP PROC [DDLCollector].[UP_ParseDeadlockGraphEventMsg]
GO

CREATE PROCEDURE [DDLCollector].[UP_ParseDeadlockGraphEventMsg](
	@DeadlockGraph_Msg XML
)
AS  
BEGIN      
	SET NOCOUNT ON; 

	;WITH deadlock
	AS
	(
		SELECT
			OwnerID = T.C.value('@id', 'varchar(50)')
			,SPid = T.C.value('(./@spid)[1]','int')
			,status = T.C.value('(./@status)[1]','varchar(10)')
			,Victim = case 
						when T.C.value('@id', 'varchar(50)') = T.C.value('./../../@victim','varchar(50)') then 1 
						else 0 end
			,LockMode = T.C.value('@lockMode', 'sysname')
			,Inputbuf = T.C.value('(./inputbuf/text())[1]','nvarchar(max)')
			,Code = T.C.value('(./executionStack/frame/text())[1]','nvarchar(max)')
			,SPName = T.C.value('(./executionStack/frame/@procname)[1]','sysname')
			,Hostname = T.C.value('(./@hostname)[1]','sysname')
			,Clientapp = T.C.value('(./@clientapp)[1]','varchar(1000)')
			,lasttranstarted = T.C.value('(./@lasttranstarted)[1]','datetime')
			,lastbatchstarted = T.C.value('(./@lastbatchstarted)[1]','datetime')
			,lastbatchcompleted = T.C.value('(./@lastbatchcompleted)[1]','datetime')
			,LoginName = T.C.value('@loginname', 'sysname')
			,Action = T.C.value('(./@transactionname)[1]','varchar(1000)')
		FROM @DeadlockGraph_Msg.nodes('EVENT_INSTANCE/TextData/deadlock-list/deadlock/process-list/process') AS T(C)
	)
	,
	keylock
	AS
	(
		SELECT
			OwnerID = T.C.value('./owner[1]/@id', 'varchar(50)')
			,KeylockObject = T.C.value('./../@objectname', 'sysname')
			,Indexname = T.C.value('./../@indexname', 'sysname')
			,IndexLockMode = T.C.value('./../@mode', 'sysname')
		FROM @DeadlockGraph_Msg.nodes('EVENT_INSTANCE/TextData/deadlock-list/deadlock/resource-list/keylock/owner-list') AS T(C)
	)
	SELECT
		SQLInstance = A.Hostname 
		,A.SPid
		,is_Vitim = A.Victim
		,DeadlockGraph = @DeadlockGraph_Msg.query('EVENT_INSTANCE/TextData/deadlock-list')
		,DeadlockGraphCheckSum = CHECKSUM(CAST(@DeadlockGraph_Msg AS NVARCHAR(MAX)))
		,A.lasttranstarted
		,A.lastbatchstarted
		,A.lastbatchcompleted
		,A.SPName
		,A.Code
		,A.LockMode
		,B.Indexname
		,B.KeylockObject
		,B.IndexLockMode
		,A.Inputbuf
		,A.LoginName
		,A.Clientapp
		,A.Action
		,status
		,[Record_Time] = GETDATE()
	FROM deadlock AS A
			LEFT JOIN keylock AS B
			ON A.OwnerID = B.OwnerID
	ORDER BY A.SPid, A.Victim
	;
END
GO

-- Create store Procedure for Center server service queue to process deadlock xml
-- when message sending from client server.
IF OBJECT_ID('DDLCollector.UP_ProcessDeadlockGraphEventMsg', 'P') IS NOT NULL
	DROP PROC [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]
GO

CREATE PROCEDURE [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]
AS
/*
EXEC [DDLCollector].[UP_ProcessDeadlockGraphEventMsg]

SELECT * FROM [DDLCollector].[Collect_Records]

SELECT * FROM [DDLCollector].[Error_Records]

SELECT * FROM [DDLCollector].[Deadlock_Info]
*/
BEGIN      
	SET NOCOUNT ON;   
	DECLARE 
		@handle UNIQUEIDENTIFIER
		, @Message_Type SYSNAME
		, @Service_Name SYSNAME
		, @Contact_Name SYSNAME
		, @Error_Details VARCHAR(2000)
		, @Message_Body XML
		, @Proc_Name SYSNAME
	;

	-- Store Procedure name
	SELECT 
        @Proc_Name = ISNULL(QUOTENAME(SCHEMA_NAME(SCHEMA_ID)) 
        + '.' 
        + QUOTENAME(OBJECT_NAME(@@PROCID)),'')
    FROM sys.procedures
    WHERE OBJECT_ID = @@PROCID
	;

	BEGIN TRY
        
	-- Receive deadlock message from service queue
	WAITFOR(RECEIVE TOP(1)        
			@handle = conversation_handle
			, @Message_Type = message_type_name
			, @Service_Name = service_name
			, @Contact_Name = service_contract_name
			, @Message_Body = message_body        
			FROM dbo.[https://soa/deadlock/queue/CenterQueue]),Timeout 500
	;
	        
	IF(@@Rowcount=0)      
		BEGIN   
			RETURN        
		END
	-- Message type is the very correct one
	ELSE IF @Message_Type = N'https://soa/deadlock/MsgType/Request'        
		BEGIN               
			-- Record message log first
			INSERT INTO  [DDLCollector].[Collect_Records](Deadlock_Graph_Msg, [Deadlock_Graph_Msg_CheckSum])          
			VALUES(@Message_Body, CHECKSUM(cast(@Message_Body as NVARCHAR(MAX))))
		
			END CONVERSATION @handle

			--Here call another Store Procedure to process our message to record deadlock relation info
			INSERT INTO [DDLCollector].[Deadlock_Info]
			EXEC [DDLCollector].[UP_ParseDeadlockGraphEventMsg] @Message_Body;
		END
	--End Diaglog Message Type, that means we should end this conversation      
	ELSE IF @Message_Type = N'https://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'        
		BEGIN         
			END CONVERSATION @handle;        
		END
	-- Konwn Service Broker Errors by System.   
	ELSE IF @Message_Type = N'https://schemas.microsoft.com/SQL/ServiceBroker/Error'        
		BEGIN         
			END CONVERSATION @handle       
		
			INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])          
			VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Exception Store Procedure: ' + @Proc_Name);               
		END       
	ELSE
		-- unknown Message Types.        
		BEGIN         
			END CONVERSATION @handle

			INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])          
			VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, ' Received unexpected message type when executing Store Procedure: ' + @Proc_Name);

			-- unexpected message type         
			RAISERROR (N' Received unexpected message type: %s', 16, 1, @Message_Type) WITH LOG;        
		END      
	END TRY      
	BEGIN CATCH       
	BEGIN
		-- record exception record       
		SET   @Error_Details=
				' Error Number: ' + CAST(ERROR_NUMBER() AS VARCHAR(10)) +
				' Error Message : ' + ERROR_MESSAGE() +
				' Error Severity: ' + CAST(ERROR_SEVERITY() AS VARCHAR(10)) +
				' Error State: ' + CAST(ERROR_STATE() AS VARCHAR(10)) +
				' Error Line: ' + CAST(ERROR_LINE() AS VARCHAR(10)) + 
				' Exception Proc: ' + @Proc_Name
		;    
        
		INSERT INTO [DDLCollector].[Error_Records]([Msg_Body], [Conversation_handle], [Message_Type], [Service_Name], [Contact_Name], [Error_Details])          
		VALUES(@Message_Body, @handle, @Message_Type, @Service_Name, @Contact_Name, @Error_Details); 
	END      
	END CATCH  
END  
GO
  • 創建Master庫下Master Key
USE master
GO
-- If the master key is not available, create it. 
IF NOT EXISTS (SELECT * 
				FROM sys.symmetric_keys
				WHERE name LIKE '%MS_DatabaseMasterKey%') 
BEGIN
	CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'CenterMasterKey*'; 
END 
GO 
  • 創建傳輸層本地證書並備份到本地文件係統
USE master
GO
-- Crete Transport Layer Certification
CREATE CERTIFICATE TrpCert_RemoteCenter
AUTHORIZATION dbo
WITH SUBJECT = 'TrpCert_RemoteCenter',
START_DATE = '05/07/2017',
EXPIRY_DATE = '12/30/9999'
GO

-- then backup it up to local path
-- and after that copy it to Client server
BACKUP CERTIFICATE TrpCert_RemoteCenter
TO FILE = 'C:\Temp\TrpCert_RemoteCenter.cer';
GO
  • 創建傳輸層遠程證書,這個證書文件來至於Deadlock Client SQL Server
USE master
GO
-- Create certification came from client Server.
CREATE	CERTIFICATE TrpCert_ClientLocal 
FROM FILE = 'C:\Temp\TrpCert_ClientLocal.cer'
GO
  • 創建基於證書文件的用戶登錄
USE master
GO
-- Create user login
IF NOT EXISTS(SELECT * 
			FROM sys.syslogins 
			WHERE name='SSBDbo')
BEGIN
	CREATE LOGIN SSBDbo FROM CERTIFICATE TrpCert_RemoteCenter;
END
GO
  • 創建Service Broker TCP/IP通訊端口並授權用戶連接權限
USE master
GO
-- Creaet Tcp endpoint for SSB comunication and grant connect to users. 	 
CREATE ENDPOINT EP_SSB_RemoteCenter
STATE = STARTED 
AS TCP 
( 
 	LISTENER_PORT = 4024
) 
FOR SERVICE_BROKER (AUTHENTICATION = CERTIFICATE TrpCert_RemoteCenter,  ENCRYPTION = REQUIRED 
) 
GO 

-- Grant Connect on Endpoint to User SSBDbo
GRANT CONNECT ON ENDPOINT::EP_SSB_RemoteCenter TO SSBDbo 
GO
  • 創建DDLCenter數據庫Master Key
-- Now, let's go inside to conversation database
USE DDLCenter
GO

-- Create Master Key
IF NOT EXISTS (SELECT * 
				FROM sys.symmetric_keys 
				WHERE name LIKE '%MS_DatabaseMasterKey%')
BEGIN		
	CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'DDLCenterMasterKey*';
END
GO
  • 創建會話層本地證書
USE DDLCenter
GO
-- Create conversation layer certification
CREATE CERTIFICATE DlgCert_RemoteCenter
AUTHORIZATION dbo
WITH SUBJECT = 'DlgCert_RemoteCenter',
START_DATE = '05/07/2017',
EXPIRY_DATE = '12/30/9999'
GO

-- backup it up to local path
-- and then copy it to remote client server
BACKUP CERTIFICATE DlgCert_RemoteCenter
TO FILE = 'C:\Temp\DlgCert_RemoteCenter.cer';
GO
  • 創建DDLCenter用戶,不需要和任何用戶登錄匹配
USE DDLCenter
GO
-- Create User for login under conversation database
IF NOT EXISTS(
	SELECT TOP 1 *
	FROM sys.database_principals
	WHERE name = 'SSBDbo'
)
BEGIN
	--CREATE USER SSBDbo FOR LOGIN SSBDbo;
	CREATE USER SSBDbo WITHOUT LOGIN;
END
GO
  • 創建會話層遠程證書,這個證書文件來自Deadlock Center SQL Server備份
USE DDLCenter
GO
-- Create converstaion layer certification came from remote client server.
CREATE	CERTIFICATE DlgCert_ClientLocal
AUTHORIZATION SSBDbo
FROM FILE='C:\Temp\DlgCert_ClientLocal.cer'
GO

GRANT CONNECT TO SSBDbo;
  • 創建Service Broker組件對象
USE DDLCenter
GO

-- Create Message Type
CREATE MESSAGE TYPE [https://soa/deadlock/MsgType/Request]
       VALIDATION = WELL_FORMED_XML;
CREATE MESSAGE TYPE [https://soa/deadlock/MsgType/Response]
       VALIDATION = WELL_FORMED_XML;
GO

-- Create Contact
CREATE CONTRACT [https://soa/deadlock/contract/CheckContract](
	[https://soa/deadlock/MsgType/Request] SENT BY INITIATOR,
	[https://soa/deadlock/MsgType/Response] SENT BY TARGET
);
GO

-- Create Queue
CREATE QUEUE [dbo].[https://soa/deadlock/queue/CenterQueue] 
WITH STATUS = ON , RETENTION = OFF
, ACTIVATION (STATUS = ON , 
				PROCEDURE_NAME = [DDLCollector].[UP_ProcessDeadlockGraphEventMsg] , 
				MAX_QUEUE_READERS = 3 , 
				EXECUTE AS N'dbo') 
GO

-- Create Service
CREATE SERVICE [https://soa/deadlock/service/CenterService]
ON QUEUE [https://soa/deadlock/queue/CenterQueue]
(
  [https://soa/deadlock/contract/CheckContract]
);
GO

-- Grant Send on service to User SSBDbo
GRANT SEND ON SERVICE::[https://soa/deadlock/service/CenterService] to SSBDbo;
GO

-- Create Remote Service Bingding
CREATE REMOTE SERVICE BINDING [https://soa/deadlock/RSB/ClientRSB]
TO SERVICE 'https://soa/deadlock/service/ClientService' 
WITH  USER = SSBDbo,
ANONYMOUS=Off
GO

-- Create Route
CREATE ROUTE [https://soa/deadlock/route/ClientRoute]
WITH SERVICE_NAME = 'https://soa/deadlock/service/ClientService',
ADDRESS = 'TCP://10.211.55.3:4022';
GO

Event Notification配置

Event Notification隻需要在Deadlock Client Server創建即可,因為隻需要在Deadlock Client上跟蹤死鎖事件。在為Deadlock Client 配置Service Broker章節,我們已經為Event Notification創建了隊列、服務和路由。因此,在這裏我們隻需要創建Event Notification對象即可。方法參見如下的代碼:

USE DDLCenter
GO

-- Create Event Notification for the deadlock_graph event.
IF EXISTS(
	SELECT * FROM sys.server_event_notifications  
	WHERE name = 'DeadLockNotificationEvent'
)
BEGIN
	DROP EVENT NOTIFICATION DeadLockNotificationEvent
	ON SERVER;
END
GO


CREATE EVENT NOTIFICATION DeadLockNotificationEvent
ON SERVER
WITH FAN_IN
FOR DEADLOCK_GRAPH
TO SERVICE 
'https://soa/deadlock/service/ClientService', 
'current database'
GO

模擬死鎖

至此為止,所有對象和準備工作已經準備完成,萬事俱備隻欠東風,讓我們在Deadlock Client實例上模擬死鎖場景。首先,我們在Test數據庫下創建兩個測試表,表名分別為:dbo.test_deadlock1和dbo.test_deadlock2,代碼如下:

IF DB_ID('Test') IS NULL
	CREATE DATABASE Test;
GO

USE Test
GO

-- create two test tables
IF OBJECT_ID('dbo.test_deadlock1','u') IS NOT NULL
    DROP TABLE dbo.test_deadlock1
GO

CREATE TABLE dbo.test_deadlock1(
id INT IDENTITY(1,1) not null PRIMARY KEY
,name VARCHAR(20) null
);

IF OBJECT_ID('dbo.test_deadlock2','u') IS NOT NULL
    DROP TABLE dbo.test_deadlock2
GO

CREATE TABLE dbo.test_deadlock2(
id INT IDENTITY(1,1) not null PRIMARY KEY
,name VARCHAR(20) null
);

INSERT INTO dbo.test_deadlock1
SELECT 'AA'
UNION ALL
SELECT 'BB';


INSERT INTO dbo.test_deadlock2
SELECT 'AA'
UNION ALL
SELECT 'BB';
GO

接下來,我們使用SSMS打開一個新的連接,我們假設叫session 1,執行如下語句:

--session 1
USE Test
GO

BEGIN TRAN 
UPDATE dbo.test_deadlock1
SET name = 'CC'
WHERE id = 1
;
WAITFOR DELAY '00:00:05'

UPDATE dbo.test_deadlock2
SET name = 'CC'
WHERE id = 1
;
ROLLBACK

緊接著,我們使用SSMS打開第二個連接,假設叫Session 2,執行下麵的語句:

--session 2
USE Test
GO

BEGIN TRAN 
UPDATE dbo.test_deadlock2
SET name = 'CC'
WHERE id = 1
;

UPDATE dbo.test_deadlock1
SET name = 'CC'
WHERE id = 1
;
COMMIT

等待一會兒功夫以後,死鎖發生,並且Session 2做為了死鎖的犧牲品,我們會在Session 2的SSMS信息窗口中看到如下的死鎖信息:

Msg 1205, Level 13, State 51, Line 8
Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

用戶查詢死鎖信息

根據上麵的模擬死鎖小節,說明死鎖已經真真切切的發生了,那麼,死鎖信息到底有沒有被捕獲到呢?如果終端用戶想要查看和分析所有客戶端的死鎖信息,隻需要連接Deadlock Center SQL Server,執行下麵的語句:

-- Run on Deadlock Center Server
USE DDLCenter
GO

SELECT * FROM [DDLCollector].[Deadlock_Info]

由於結果集寬度太寬,人為將查詢結果分兩段截圖,第一段結果集展示如下:
03.png

第二段結果集截圖如下:
04.png

從這個結果集,我們可以清楚的看到Deadlock Client發生死鎖的詳細信息,包含:

  • 死鎖發生的Deadlock Client實例名稱:CHERISH-PC

  • 被死鎖進程號60,死鎖進程57號

  • 死鎖相關進程的事務開始時間,最後一個Batch開始執行時間和完成時間

  • 死鎖進程執行的代碼和Batch語句

  • 死鎖發生時鎖的類型

  • 表和索引名稱

  • 死鎖相關進程的登錄用戶

……
等等。

踩過的坑

當Deadlock Client 上SQL Server發生兩次或者兩次以上的Deadlock事件以後,自建的Event Notification對象(名為:DeadLockNotificationEvent)會被SQL Server係統自動刪除,從而導致整個死鎖收集係統無法工作。

表象

SQL Server在錯誤日誌中會拋出如下4個錯誤信息:兩個錯誤編號為17004,一個編號為17001的錯誤,最後是一個編號為17005錯誤,其中17005明確說明了,Event Notification對象被刪除了。如下:

Error: 17004, Severity: 16, State: 1.
Event notification conversation on dialog handle '{4A6A0FBD-7A34-E711-A709-001C42099969}' closed without an error.
Error: 17004, Severity: 16, State: 1.
Event notification conversation on dialog handle '{476A0FBD-7A34-E711-A709-001C42099969}' closed without an error.
Error: 17001, Severity: 16, State: 1.
Failure to send an event notification instance of type 'DEADLOCK_GRAPH' on conversation handle '{F711A404-7934-E711-A709-001C42099969}'. Error Code = '8429'.
Error: 17005, Severity: 16, State: 1.
Event notification 'DeadLockNotificationEvent' in database 'master' dropped due to send time service broker errors. Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active.

錯誤日誌截圖如下:
05.png

問題分析

從錯誤提示信息due to send time service broker errors來看,最開始花了很長時間來排查Service Broker方麵的問題,在長達數小時的問題排查無果後,靜下心來仔細想想:如果是Service Broker有問題的話,我們不可能完成第一、第二條死鎖信息的收集,所以問題應該與Service Broker沒有直接關係。於是,注意到了錯誤提示信息的後半部分Check to ensure the conversation handle, service broker contract, and service specified in the event notification are active,再次以可以成功收集兩條deadlock錯誤信息為由,排除Contact和Service的問題可能性,所以最有可能出問題的地方猜測應該是conversation handle,繼續排查與conversation handle相關操作的地方,發現存儲過程[DDLCollector].[UP_ProcessDeadlockEventMsg]的中的代碼:

...
ELSE IF @Message_Type = 'https://schemas.microsoft.com/SQL/Notifications/EventNotification'      
		BEGIN               
			-- Record message log first
			INSERT INTO  [DDLCollector].[Deadlock_Traced_Records](Processed_Msg, [Processed_Msg_CheckSum])         
			VALUES(@Message_Body, CHECKSUM(CAST(@Message_Body as NVARCHAR(MAX))))
		
			END CONVERSATION @handle

			--Here call another Store Procedure to send deadlock graph info to center server
			EXEC [DDLCollector].[UP_SendDeadlockMsg] @Message_Body;
		END
...

這個邏輯分支不應該有End Conversation的操作,因為這裏是與Event Notification相關的Message Type操作,而不是Service Broker相關的Message Type操作。

解決問題

問題分析清楚了,解決方法就非常簡單了,注釋掉這條語句END CONVERSATION @handle後,重新創建存儲過程。再多次模擬死鎖操作,再也沒有出現Event Notification被係統自動刪除的情況了,說明這個問題已經被徹底解決,坑已經被填上了。
解決問題的代碼修改和注釋如下截圖,以此紀念下踩過的這個坑:
06.png

福利發放

以下是關於SQL Server死鎖相關的係列文章,可以幫助我們全麵了解、分析和解決死鎖問題,其中第一個是這篇文章的視頻演示。

最後總結

這篇文章是一個完整的SQL Server死鎖收集係統典型案例介紹,你甚至可以很輕鬆簡單的將這個方案應用到你的產品環境,來收集產品環境所有SQL Server實例發生死鎖的詳細信息,並根據該係統收集到的場景來改進和改善死鎖發生的概率,從而降低死應用發生異常錯誤的可能性。因此這篇文章有著非常重要的現實價值和意義。

最後更新:2017-05-21 09:01:50

  上一篇:go  PostgreSQL · 實現分析 · PostgreSQL 10.0 並行查詢和外部表的結合
  下一篇:go  HybridDB · 最佳實踐 · HybridDB 數據合並的方法與原理