Example of troubleshooting Distribution Agent errors
I thought it would be helpful to post a Replication Distribution Agent troubleshooting case to show more about replication components and troubleshooting approaches.
Problem:
SQL Server Distribution Agent reported “Failed” in Replication Monitor. To capture text of the message we added the following parameters to the Distribution Agent job and restarted the job. The –output parameter writes Agent log to text file showing each step the Distribution Agent was performing along with actual text of message.
-output c:\temp\dist.log and -commitbatchsize 1
Output
2018-12-29 00:17:53.917 Agent message code 8144. Procedure or function sp_MSupd_dboAddress has too many arguments specified.
2018-12-29 00:17:53.932 ErrorId = 35, SourceTypeId = 0
ErrorCode = ‘8144’
ErrorText = ‘Procedure or function sp_MSupd_dboAddress has too many arguments specified.’
2018-12-29 00:17:53.948 Adding alert to msdb..sysreplicationalerts: ErrorId = 35,
Transaction Seqno = 000a261100001560013e00000000, Command ID = 1
Not needed for this case, but often we capture Replication commands being executed via SQL Server Extended Events or Profiler Trace RPC and Batch events along with errors and warning.
Background:
Customer’s Transactional Replication publication consisted of a couple of very large tables generating timeouts when initial snapshot was being applied to the subscriber. To get around this problem, the subscriber was setup using a Backup/Restore from the Publisher. Steps-by-step is documented in doc.Microsoft.com and https://repltalk.com/how-to-manually-synchronize-replication-subscriptions-by-using-backup-or-restore/ . Distribution Agent worked for a few minutes, then failed with error above.
How is works:
In Transactional Replication, the LogReader agent is picking up committed transaction from the published database’s transaction log, writing the Insert\Update\Delete commands to the Distribution database. The Distribution Agent is picking up those commands and applying on the Subscriber. Changes are not stored in distribution database as SQL, i.e. “update table set column a = 1 where primarykey = ‘abc’ “, but instead as calls stored procedure along with a parameter list. The Distribution Agent calls these Insert/Update/Deleted stored procedures on the subscribers with the appropriate parameter list. These stored procedures have format sp_MSups_<schema><tablename> as in sp_MSupd_dboAddress, sp_MSdel_dboAddress, or sp_MSins_dboAddress.
Approach:
The error indicates a mis-match in the number of parameters (columns) in the command stored in the Distribution DB compared to the number of columns in the Replication created stored procedure. We needed to see which one is correct.
First step was to compare the SCHEMA of the Published database to the Subscriber. Since the Subscriber was a backup of the Publisher, I suspected them to be the same, but you never know unless you check. We executed the following command on both the Pub and Sub and yes, they were identical.
sp_help Address
Next was to look at actual command text stored in the Distribution database. The text is stored as binary, however, using Replication built-in command sp_browsereplcmds and optional parameters we could return text of the transaction of interest. We used the Transaction ID shows in the error message.
Transaction Seqno = 000a261100001560013e00000000 >>>>drop off trailing 8 0<<<<
sp_browsereplcmds @xact_seqno_start = ‘0x000a261100001560013e’, @xact_seqno_end = ‘0x000a261100001560013e’
Output
{CALL [sp_MSupd_dboAddress] (,,,,,,,,,,2009-12-28 02:22:10.000,,,,,2009-12-27 02:22:13.683,,,620190,0×008400)}
We confirmed table schema between Publisher and Subscriber are the same. We can see command and parameters being stored in the Distribution database that generated the error. Next check how these parameters match to the Subscribers stored procedure code. To get that we executed command below on the Subscriber.
sp_helptext sp_MSupd_dboAddress
Output:
CREATE procedure [sp_MSupd_dboAddress]
@c1 int,@c2 int,@c3 char(1),@c4 varchar(40),@c5 varchar(40),@c6 varchar(30),@c7 varchar(30),@c8 varchar(30),@c9 varchar(30),@c10 varchar(10),@c11 datetime,@c12 varchar(7),@c13 varchar(7),@c14 char(1),@c15 varchar(50),@c16 datetime,@c17 varchar(4),@c18 varchar(2),@pkc1 int
as
begin
update [dbo].[Address] set
[person_id] = @c2
,[address_type] = @c3
. . . .
,[uc_code] = @c18
where [address_id] = @pkc1
Looking at the code I see the last “expected” value is a PrimaryKey used in the WHERE clause to update 1 row. However, the sp_MSupd_dboAddress parameter list has a binary value 0x008400 as the last parameter. Clearly the parameter list doesn’t match, but which is right?
What is the expected behavior?
When troubleshooting these problems, sometimes best to step back and determine what is “expected” behavior. Setting up quick test to learn what’s expected may provide clue as to what’s broken. We do this for performance troubleshooting when capturing “baseline” data. Following this logic my next step was to setup a simple Transactional Replication publication using sample AdventureWorks database.
Using the Replication Wizard to publish 1 table, then scripting out the subscriber stored procedure, I could clearly see bitmap parameter along with additional logic within the stored procedure. Since the “broken” subscriber didn’t have this expected code to handle the correct number of parameters, I knew the Replication generated stored procedures on the Subscriber was incorrect and needed to be updated.
How to correct the problem?
SQL Server includes stored procedure sp_scriptpublicationcustomprocs to re-generate new set of subscriber stored procedures. Execute this command on the Publisher with OUTPUT AS TEXT. Then execute the output on the Subscriber. Make sure to increase the TEXT COLUMN WIDTH to 5000 in the Query Properties window or your stored procedure code will get truncated at the default 256 characters. Yeah, I found that one out the hard way. Thankfully, the code automatically includes all DROP then CREATEs SPs needed by the Subscriber and I just ran it a 2nd time with a wider text window.
sp_scriptpublicationcustomprocs ‘<publication>’
Once new create stored procedure scripts were executed on the subscriber, the Distribution Agent executed the correct commands with matching parameter list, no reinitialization required. This same command can be used anytime the subscriber stored procedures are accidently DROPPED.