Divide and Conquer Transactional Replication using Tracer Tokens
As we know, Transactional Replication moves data from the Publisher to the Distributor then to the Subscribers. When working, data flows near real-time through both of these steps. When problems occur which interrupt this flow, latency is introduced. The challenge is to find which step is contributing to the overall latency. Tracer Tokens help you determine of the latency problem is 1) Publisher to Distributor, or 2) Distributor to Subscriber.
Trace Tokens
Available from Replication Monitor or via TSQL statements, Tracer Tokens are special timestamp transactions written to the Publisher’s Transaction Log and picked up by the Log Reader. They are then read by the Distribution Agent and written to the Subscriber. Timestamps for each step are recorded in tracking tables in the Distribution Database and can be displayed in Replication Monitor or via TSQL statements. Lets take a look.
From SQL Server Management Studio, right-click Replication, then Launch Replication Monitor
In Replication Monitor select the Transactional Replication publication, then “Tracer Tokens”. To insert new Token click “Insert Tracer”. When Tracer Token is inserted Replication Monitor indicates “Pending…”
When Log Reader picks up Token it records time in MStracer_tokens table in the Distribution database. The Distribution Agent then picks up the Token and records Subscriber(s) write time in the MStracer_history tables also in the Distribution database.
Examples Tracer Token Readings
In this example latency is running about 6 seconds. Writes to the subscribers are 1 to 2 seconds however its taking 5 seconds for the Tracer Token to move from the Publisher to the Distributor. Focus of the Transactional Latency clearly should start with the Log Reader and not with the Distributors.
In this example it took 3 seconds for the Token to move to the Distributor, 1 seconds to Subscriber 1 and 4 seconds to Subscribers 2. While Log Readers not as fast as expected, improving latency to Subscriber 2 will cut in half overall latency.
TSQL Commands to explore Tracer Tokens
The TSQL command below sp_posttracertoken can be used to POST a Tracer Token. This command can be executed in SQL Server Management Studio or for running historical reference create a SQL Agent Job and post a Tracer Token at scheduled interval.
–For a historical perspective create
–a SQL Agent JOB to insert a new
–Tracer Token in the publication database.
USE [AdventureWorksLT]
Go
EXEC sys.sp_posttracertoken @publication = ‘TranProducts’
Go
–Token Tracking Tables
USE Distribution
Go
–publisher_commit
SELECT Top 20 * FROM MStracer_tokens
Order by tracer_id desc
–subscriber_commit
SELECT Top 20 * FROM MStracer_history
Order by parent_tracer_id desc
“Publisher to Distributor” (Log Reader) still showing “Pending”.
“Null” in Tracer Token tables or “Pending” in Replication Monitor indicates Log Reader is not running or there are so many un-replicated transactions the Log Reader has yet to read the Tracer Token from the Transaction Log. First, verify the Log Reader is running and no errors have been reported in the Log Reader History.
Using Replication Monitor selecting Agents we clearly see the Log Reader Agent is not running. Last Action shows it didn’t fail, but was Stopped.
Right-click the Log Reader Agent to “View Details”. Agent statistics along with errors encountered will be displayed in the “Details”. Note the “Stop Agent” is grayed out and “Start Agent” is selectable. Again confirming the Log Reader Agent is stopped. This menu also provides access to the Agent Profiles and individual command line properties for customizing the Log Reader Agent settings.
Tracer Token Tracking Tables
You can also cross verify by looking at the last updates recorded in the MSlogreader_history table. If you have more then one Publication include a WHERE clause for the correct agent_id.
–Check Log Reader Agents Table for correct Agent_id
SELECT id, publisher_db
FROM distribution.dbo.MSlogreader_agents
id publisher_db
———– ———————————————
1 AdventureWorksLT
–Check Agent History Table for last updates
SELECT top 10 agent_id, start_time,comments
FROM distribution.dbo.MSlogreader_history
ORDER BY time desc
agent_id start_time comments
———– ———————– ———————————————
1 2010–03–07 12:46:21.503 The process was successfully stopped.
1 2010–03–07 12:46:21.503 No replicated transactions are available.
Log Reader Latency
If you see NULL showing, again this indicates the Log Reader has not yet read the Tracer Token in the published database’s transaction log. If you’ve verify the Log Reader Agent IS running but Tracer Token are still not being picked up, check out ReplTalk blog posting on Troubleshooting Replication Conversations for additional troubleshooting steps.
–publisher_commit
SELECT Top 20 * FROM MStracer_tokens
Order by tracer_id desc
tracer_id publication_id publisher_commit distributor_commit
———– ————– ———————– ———————–
–2147483431 1 2010–03–10 20:08:00.400 NULL
Subscriber Offline
As this screenshot shows, even without Tracer Token I can see the Distribution Agents have either been stopped or the subscribers have been offline for a long time. Next steps would be 1) Verify Distribution Agents are running, 2) Check Agent History in Replication Monitor for any errors, 3) Confirm Subscribers are up and running and you can connect.
Subscriber Latency
Tracer Token tables can help explore subscriber latency. Queries below show end-to-end timestamps for Tracer Tokens. The script and output look best if pasted into SQL Server Management Studio.
–End to End time
SELECT publication_id, agent_id,
t.publisher_commit, t.distributor_commit, h.subscriber_commit
FROM MStracer_tokens t
JOIN MStracer_history h
ON t.tracer_id = h.parent_tracer_id
ORDER BY t.publisher_commit DESC
–Historical Breakdown (in seconds)
SELECT publication_id ‘PubID’, agent_id ‘AgentID’,convert(smalldatetime,t.publisher_commit) ‘PubCommit’,
convert(smalldatetime,t.distributor_commit) ‘DistCommit’,
convert(smalldatetime,h.subscriber_commit) ‘SubCommit’,
Datediff(s,t.publisher_commit,t.distributor_commit) as ‘Dist(sec)’,
Datediff(s,t.distributor_commit,h.subscriber_commit) as ‘Sub(sec)’
FROM MStracer_tokens t
JOIN MStracer_history h
ON t.tracer_id = h.parent_tracer_id
ORDER BY t.publisher_commit DESC
/*
–Slow LogReader
PubID AgentID PubCommit DistCommit SubCommit Dist(sec) Sub(sec)
———– ———– ———————– ———————– ———————– ———– ———–
1 3 2010-02-25 20:38:00 2010-02-25 20:53:00 2010-02-25 20:54:00 902 5
1 4 2010-02-25 20:38:00 2010-02-25 20:53:00 2010-02-25 20:54:00 902 7
1 3 2010-02-25 20:46:00 2010-02-25 20:53:00 2010-02-25 20:54:00 427 5
1 4 2010-02-25 20:46:00 2010-02-25 20:53:00 2010-02-25 20:54:00 427 7
1 3 2010-02-25 20:46:00 2010-02-25 20:53:00 2010-02-25 20:54:00 425 5
1 4 2010-02-25 20:46:00 2010-02-25 20:53:00 2010-02-25 20:54:00 425 7
–Slow Distributor on 1 subscriber
PubID AgentID PubCommit DistCommit SubCommit Dist(sec) Sub(sec)
———– ———– ———————– ———————– ———————– ———– ———–
1 3 2010-03-04 17:10:00 2010-03-04 17:10:00 2010-03-04 17:11:00 1 42
1 4 2010-03-04 17:10:00 2010-03-04 17:10:00 2010-03-04 17:11:00 1 4
1 3 2010-03-04 17:28:00 2010-03-04 17:28:00 2010-03-04 17:29:00 6 36
1 4 2010-03-04 17:28:00 2010-03-04 17:28:00 2010-03-04 17:28:00 6 3
1 3 2010-03-04 17:30:00 2010-03-04 17:30:00 2010-03-04 17:31:00 5 33
1 4 2010-03-04 17:30:00 2010-03-04 17:30:00 2010-03-04 17:30:00 5 2
*/
–By Agent_id
SELECT publication_id ‘PubID’, agent_id ‘AgentID’,convert(smalldatetime,t.publisher_commit) ‘PubCommit’,
convert(smalldatetime,t.distributor_commit) ‘DistCommit’,
convert(smalldatetime,h.subscriber_commit) ‘SubCommit’,
Datediff(s,t.publisher_commit,t.distributor_commit) as ‘Dist(sec)’,
Datediff(s,t.distributor_commit,h.subscriber_commit) as ‘Sub(sec)’
FROM MStracer_tokens t
JOIN MStracer_history h
ON t.tracer_id = h.parent_tracer_id
ORDER BY agent_id,t.publisher_commit DESC
/*
PubID AgentID PubCommit DistCommit SubCommit Dist(sec) Sub(sec)
———– ———– ———————– ———————– ———————– ———– ———–
1 3 2010-03-03 09:38:00 2010-03-03 09:38:00 2010-03-03 09:38:00 2 3
1 3 2010-03-03 09:39:00 2010-03-03 09:39:00 2010-03-03 09:39:00 5 3
1 3 2010-03-04 16:43:00 2010-03-04 16:43:00 2010-03-04 16:43:00 3 1
1 3 2010-03-04 17:10:00 2010-03-04 17:10:00 2010-03-04 17:11:00 1 42
1 3 2010-03-04 17:28:00 2010-03-04 17:28:00 2010-03-04 17:29:00 6 36
1 3 2010-03-04 17:30:00 2010-03-04 17:30:00 2010-03-04 17:31:00 5 33
1 3 2010-03-04 17:32:00 2010-03-04 17:32:00 2010-03-04 17:33:00 2 36
*/
—Find worst performing subscribers.
SELECT
convert(varchar(10),agent_id) as ‘agent id’,
max(Datediff(s,distributor_commit,subscriber_commit)) as ‘MAXTime To Sub (sec)’,
avg(Datediff(s,distributor_commit,subscriber_commit)) as ‘AVG Time To Sub (sec)’
FROM MStracer_tokens t
JOIN MStracer_history h
ON t.tracer_id = h.parent_tracer_id
group by agent_id
order by 2 desc
/*
agent id MAXTime To Sub (sec) AVG Time To Sub (sec)
———- ——————– ———————
3 57391 21372
4 286 9
*/
–Well, so who is Agent 3?
SELECT id, name, publisher_db, publication
FROM distribution.dbo.MSdistribution_agents
/*
id name publisher_db publication
—– ——————————————— ——————— ————
3 CHRISSKACER-AdventureWorksLT-TranProducts-CH AdventureWorksLT TranProducts
4 CHRISSKACER-AdventureWorksLT-TranProducts-CH AdventureWorksLT TranProducts
*/
–Use “name” from Agents to retrieve Job Properties
— command line to get Subscriber Name
SELECT sjs.step_name, sjs.command from msdb.dbo.sysjobsteps sjs
join msdb.dbo.sysjobs sj
on sj.job_id = sjs.job_id
Where name = ‘CHRISSKACER-AdventureWorksLT-TranProducts-CHRISSKACER-3’
/*
step_name command
————————————— ——————————————-
Distribution Agent startup message. sp_MSadd_distribution_history @perfmon_i
Run agent. -Subscriber [CHRISSKACER]
-SubscriberDB [AdventureWorks_Tran
Detect nonlogged agent shutdown. sp_MSdetect_nonlogged_shutdown @subsystem =
(3 row(s) affected)
*/
Divide and Conquer
Know where the latency problem is occurring allows you to focus on the real problem. Tracer Token can help divide Transactional Replication latency into Log Reader Agent problem v. Distribution Agent Problems. Check out ReplTalk blog posting on Troubleshooting Replication Conversations for additional Log Reader and Distributor troubleshooting steps.
Great article, one of its kind which actually explains how to track replication times.
I’m new to DBA and was looking for this kind of article to understand Replication Latency. I did find it much helpful for Latency troubleshooting. Thanks for providing this well summarized information.