Walkthrough Publisher, Distributor, Subscriber in AlwaysOn Availability Groups


UPDATE: 4/10/2020

The steps below walk through setting up a SQL Server 2016 Replication Publisher, Distributor, and Subscriber, each in its own Always On Availability Group. One set of replicas resides on one subnet and the second set on another subnet, simulating two different data centers: (Pub1, Dist1, Sub1) <–> (Pub2, Dist2, Sub2).

More Information:

Configure replication with Always On availability groups

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/configure-replication-for-always-on-availability-groups-sql-server?view=sql-server-2017

Set up replication distribution database in Always On availability group

https://docs.microsoft.com/en-us/sql/relational-databases/replication/configure-distribution-availability-group?view=sql-server-2017

Replication Subscribers and Always On Availability Groups

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/replication-subscribers-and-always-on-availability-groups-sql-server?view=sql-server-2017

Multi-Subnet Environment:

This walk-through uses a domain controller and six SQL Server instances. As you can see from the names below, each replication role (Distributor, Publisher, Subscriber) runs on its own pair of servers in its own AlwaysOn Availability Group across two subnets.

The two network subnets simulate replication roles running in two different data centers. For this configuration, both data centers are on the same “JAMES” domain, named after my late father and long-time teacher at Cochrane-Fountain City High.

The SQL Server installations have been upgraded to SQL Server 2016 CU13 to support the Distributor role in an Availability Group. As required, the Distributor role in the replication topology runs on its own SQL Server instances in its own AlwaysOn Availability Group.

SQL Server Management Studio

Download the latest SQL Server Management Studio, which includes failover Distributor support. The upgrade was installed on every server used to configure replication or to run Replication Monitor. https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms

Initial Publisher Setup

The sample database AdventureWorks was restored as “AW”; the database rename was not required.

https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks

Database AW was set to the Full recovery model, then one full backup and one transaction log backup were taken.
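As a rough illustration of that preparation, the T-SQL below sets the recovery model and takes the two backups. The backup folder is a hypothetical path, not one taken from this walkthrough.

```sql
-- Run on PUB1. The backup path is a placeholder (assumption).
ALTER DATABASE [AW] SET RECOVERY FULL;
GO
-- A full backup starts the log chain.
BACKUP DATABASE [AW]
    TO DISK = N'\\dc\share\backup\AW_full.bak'
    WITH INIT;
GO
-- A log backup follows so the database is ready to join an Availability Group.
BACKUP LOG [AW]
    TO DISK = N'\\dc\share\backup\AW_log.trn'
    WITH INIT;
GO
```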

The Agent XPs server configuration option was enabled for SQL Server Agent jobs, and a database maintenance plan was configured to ensure ongoing full and transaction log backups.
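A minimal sketch of enabling that option with sp_configure. Agent XPs is an advanced option, so advanced options are shown first.

```sql
-- Expose advanced options so that 'Agent XPs' can be changed.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
-- Enable the SQL Server Agent extended stored procedures.
EXEC sp_configure 'Agent XPs', 1;
RECONFIGURE;
GO
```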

Using Availability Group Wizard in SQL Server Management Studio, PubAG with listener PubAGListener was created for the AW database for replicas PUB1 and PUB2.

In the PubAG properties, PUB1 and PUB2 were set to “Readable Secondary” = Yes, as required for Replication. Optionally, Automatic Seeding allows database AW to be copied to the secondary without a manual backup and restore.

In this example, with two subnets, two IP addresses are added for the PubAGListener.
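For readers who prefer scripts to the wizard, the same settings can be sketched in T-SQL. This assumes PubAG already exists and the batch runs on the primary replica. The IP addresses, subnet masks, and port are hypothetical placeholders.

```sql
-- Run on the primary replica of PubAG.
-- "Readable Secondary = Yes" corresponds to ALLOW_CONNECTIONS = ALL.
ALTER AVAILABILITY GROUP [PubAG]
    MODIFY REPLICA ON N'PUB1' WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL));
ALTER AVAILABILITY GROUP [PubAG]
    MODIFY REPLICA ON N'PUB2' WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL));
GO
-- Listener with one static IP per subnet (addresses are assumptions).
ALTER AVAILABILITY GROUP [PubAG]
    ADD LISTENER N'PubAGListener' (
        WITH IP (
            (N'10.1.1.50', N'255.255.255.0'),
            (N'10.2.1.50', N'255.255.255.0')
        ),
        PORT = 1433
    );
GO
```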

Subscriber Initial Setup

Since the subscriber is initialized via a Replication Snapshot, simply create a new empty database, here called AWTEST, on SUB1. Like the AW database, the AWTEST subscriber database was set to FULL recovery, a full backup and a transaction log backup were taken, and a database maintenance plan backs up the database and transaction log (*.trn) at regular intervals, keeping the log at a reasonable size and supporting recovery.

Like the Publisher, an AlwaysOn Availability Group SubAG with listener SubAGListener was created for the AWTEST subscriber database using the SQL Management Studio Wizards.

The replica nodes are SUB1 and SUB2, both again configured as Readable Secondary so that “read” activity can be load balanced across either subscriber, with Seeding Mode = Automatic so that the initial setup needs no backup and restore.

In this topology, the Log Reader and Distribution Agent jobs run on the Distributor server. SQL Server Agent is therefore not required for Replication on PUB1, PUB2, SUB1, or SUB2. It is still required for non-replication SQL Server Agent jobs such as maintenance plans.

Distributor Setup

https://docs.microsoft.com/en-us/sql/relational-databases/replication/configure-distribution-availability-group

The @password is assigned to the repl_distributor linked server used by Replication for administrative tasks. This same password is used in later scripts.

Step 1: Configure DIST1, DIST2 as distributors using TSQL only
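The script for this step is not reproduced here. A minimal sketch, following the pattern in the Microsoft documentation linked above, might look like the following. It assumes SQLCMD mode is enabled in SSMS, and the password is a placeholder. Creating the distribution database on DIST1 is included as an assumed follow-up from that documentation, not as a step confirmed by this walkthrough.

```sql
-- Requires SQLCMD mode. The password is a placeholder (assumption).
:CONNECT DIST1
EXEC sp_adddistributor
    @distributor = @@SERVERNAME,
    @password = N'<StrongPasswordHere>';
GO
:CONNECT DIST2
EXEC sp_adddistributor
    @distributor = @@SERVERNAME,
    @password = N'<StrongPasswordHere>';
GO
-- Assumed follow-up: create the distribution database on the first node.
:CONNECT DIST1
USE master;
EXEC sp_adddistributiondb
    @database = N'distribution',
    @security_mode = 1;
GO
```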

Step 3: Create Distributor’s Availability Group for DIST1 and DIST2 using SSMS Wizard or TSQL scripts

You can create the Distributor’s Availability Group using TSQL or the New Availability Group Wizard. If using the New Availability Group Wizard, first create a blank “scratch” user database, then create the Availability Group via the Wizard. Once the AG is created, use the TSQL or SSMS steps below to add the distribution database(s) to the newly created Distributor AG.

The “scratch” database is needed because the AG Wizard does not show system databases like the distribution database(s). Once the Distributor’s AG is created, the “scratch” database can be removed from the AG and dropped.

Here we’ve created the Availability Group DisAg for replicas DIST1 and DIST2. You can configure the Distributor listener when creating the availability group.

The next step is to add the ‘distribution’ database into the DisAg availability group. The steps below configure the AG for distribution databases using TSQL scripts. The “GRANT CREATE ANY DATABASE;” statement allows support for automatic seeding.

Step 4: Configure listener (DisAgListener). This can also be performed when DisAg is created.

The Distributor listener may already exist if it was created using the New Availability Group Wizard. While not all configurations use multiple subnets, in this example the Distributor listener DisAGListener has 2 subnets.
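One way this could be scripted is sketched below. It assumes DisAg already exists with DIST1 as primary and automatic seeding enabled on the replicas. The backup path is a hypothetical placeholder.

```sql
-- Requires SQLCMD mode.
-- On the secondary: let the AG create databases so automatic seeding can run.
:CONNECT DIST2
ALTER AVAILABILITY GROUP [DisAg] GRANT CREATE ANY DATABASE;
GO
-- On the primary: full recovery and a full backup are prerequisites
-- for adding a database to an Availability Group.
:CONNECT DIST1
ALTER DATABASE [distribution] SET RECOVERY FULL;
GO
BACKUP DATABASE [distribution]
    TO DISK = N'\\dc\share\backup\distribution_full.bak'  -- placeholder path
    WITH INIT;
GO
ALTER AVAILABILITY GROUP [DisAg] ADD DATABASE [distribution];
GO
```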

Optional: Distributor Listener Custom Port: If you’re using a custom port for your Distributor listener, as shown below, you’ll need to create an alias for the Distributor listener on each Publisher and Distributor replica. Create both 32-bit and 64-bit Client Aliases using SQL Server Configuration Manager. In this example, with 2 Publisher replicas and 2 Distributor replicas, you’d need to create a total of eight aliases (2 + 2 = 4 replicas x 2 for 32-bit and 64-bit = 8).

Given the alias requirement when the Distributor listens on a custom port, evaluate whether a custom port is really needed, as the Distributor listener is only used by the Replication agents when connecting to the distribution database.

Listeners in Windows Cluster Administrator

Selecting Roles, then Availability Group, shows resulting cluster configuration Client Access Point for both virtual IP subnet addresses.

RegisterAllProvidersIP=0

For SQL Server 2017 and lower, when using Replication in a multi-subnet environment with the RegisterAllProvidersIP = 1 setting, Replication Agents may time out when starting. For example, the Log Reader agent creates a temporary linked server at startup to validate the availability group. This linked server is not configured with MultiSubnetFailover=1 (used by clients when RegisterAllProvidersIP = 1) and will time out.

To work around this problem, change the cluster Access Point Name properties to RegisterAllProvidersIP = 0 and reduce HostRecordTTL, for example to 10 (a value in the 10-120 second range), so that only the “active” listener IP is registered in DNS.

More information: https://docs.microsoft.com/en-us/archive/blogs/alwaysonpro/connection-timeouts-in-multi-subnet-availability-group

When the distribution database is participating in an Availability Group, the status in SSMS Object Explorer doesn’t show “synchronizing” as it does with user databases. Use AlwaysOn Dashboard to confirm “Healthy”.
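As an alternative to the dashboard, the Always On DMVs can be queried for the same health information. This is a sketch to run on the primary Distributor replica, where rows for all replicas are visible.

```sql
-- Synchronization state and health per database replica.
SELECT ag.name                         AS ag_name,
       ar.replica_server_name,
       DB_NAME(drs.database_id)        AS database_name,
       drs.synchronization_state_desc,
       drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON drs.replica_id = ar.replica_id
JOIN sys.availability_groups AS ag
    ON drs.group_id = ag.group_id
ORDER BY ag.name, ar.replica_server_name;
```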

Step 5: For recovery and log truncation configure Full and TLOG Backups and maintenance plans

Step 6: On DIST2 add distribution database, no additional parameters.
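A sketch of this step, based on the Microsoft documentation linked earlier rather than on the original script. It assumes SQLCMD mode and the default database name 'distribution'.

```sql
-- Requires SQLCMD mode. Registers the already-seeded distribution
-- database as a distribution database on the second node.
:CONNECT DIST2
USE master;
EXEC sp_adddistributiondb
    @database = N'distribution',
    @security_mode = 1;
GO
```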

Step 7: Add all publisher(s) to all distributor(s)

In this walkthrough the Publishers PUB1 and PUB2 participate in an Availability Group, as do the Distributors DIST1 and DIST2. These steps connect each Distributor to both Publisher servers. Notice the reference to DNS server names, not FQDNs, and not listener names. The working_directory is a shared location for Replication Snapshot files. For this walkthrough I’m using “\\dc\share\repl”, but it can be any network location accessible from all servers in the topology.

Optionally, if setting up subscribers using backup\restore, the snapshot folder is not used and the location can be any valid local folder.

When the step completes, linked servers on DIST1 and DIST2 will point to PUB1, PUB2, and repl_distributor.
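The step above could be scripted roughly as follows. This is a sketch assuming SQLCMD mode, the default 'distribution' database name, and the snapshot share named in this walkthrough.

```sql
-- Requires SQLCMD mode. Register both Publisher replicas on each Distributor.
:CONNECT DIST1
EXEC sp_adddistpublisher
    @publisher = N'PUB1',
    @distribution_db = N'distribution',
    @working_directory = N'\\dc\share\repl';
EXEC sp_adddistpublisher
    @publisher = N'PUB2',
    @distribution_db = N'distribution',
    @working_directory = N'\\dc\share\repl';
GO
:CONNECT DIST2
EXEC sp_adddistpublisher
    @publisher = N'PUB1',
    @distribution_db = N'distribution',
    @working_directory = N'\\dc\share\repl';
EXEC sp_adddistpublisher
    @publisher = N'PUB2',
    @distribution_db = N'distribution',
    @working_directory = N'\\dc\share\repl';
GO
-- Optional check: list the linked servers that were created.
SELECT name, data_source, is_linked FROM sys.servers;
GO
```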

Publisher Workflow

When this completes, PUB1 will have a linked server named repl_distributor, which is mapped in sys.sysservers to DISAGLISTENER.
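A minimal sketch of the Publisher-side call. The password placeholder stands for the same @password value assigned to repl_distributor earlier. Run the same statement on each Publisher replica, locally on that server.

```sql
-- Run locally on PUB1, then locally on PUB2.
-- @distributor is the Distributor's listener, not a replica name.
EXEC sp_adddistributor
    @distributor = N'DisAgListener',
    @password = N'<StrongPasswordHere>';  -- placeholder (assumption)
GO
-- Optional check: repl_distributor should map to the listener.
SELECT name, data_source
FROM sys.servers
WHERE name = N'repl_distributor';
GO
```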

:Connect PUB2

The error below occurs when not logged on directly to PUB2 while executing sp_adddistributor. Make Remote Desktop connection to PUB2 then execute sp_adddistributor.

/*

OLE DB provider “SQLNCLI11” for linked server “repl_distributor” returned message “Unable to complete login process due to delay in opening server connection”.

Msg 7303, Level 16, State 1, Procedure sp_adddistributor, Line 168 [Batch Start Line 102]

Cannot initialize the data source object of OLE DB provider “SQLNCLI11” for linked server “repl_distributor”.

*/

Here we can see PUB2 now has linked server to repl_distributor which is mapped in sys.sysservers to the distributor listener DISAGLISTENER.

Create Publication and specify redirected publisher

On active Publisher replica, enable the publisher database for publishing.
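In T-SQL, enabling the database for publishing is typically done with sp_replicationdboption. A sketch for the AW database used in this walkthrough:

```sql
-- Run on the current primary Publisher replica.
USE master;
EXEC sp_replicationdboption
    @dbname = N'AW',
    @optname = N'publish',
    @value = N'true';
GO
```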

Create publication using SSMS Replication wizard or via TSQL replication scripts.  No changes needed at this step when publication is being created.  If using TSQL scripts, don’t execute “sp_addsubscription” at this time.

NOTE: If Publisher “listener” is using a non-default port, specify the port number in the @redirected_publisher parameter as shown below.

Failure to specify the Publisher Listener port will result in Log Reader Agent errors such as:

Status: 32768, code: 53044, text: ‘Validating publisher’.

Status: 0, code: 21879, text: ‘Unable to query the redirected server

Status: 0, code: 22037, text: ‘Errors were logged when validating the redirected publisher.’.
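A sketch of the redirect itself, run at the Distributor in the distribution database. It assumes PUB1 is the original publisher, as in the rest of this walkthrough. The port in the commented variant is a hypothetical value.

```sql
-- Run on the primary Distributor replica.
USE distribution;
GO
EXEC sys.sp_redirect_publisher
    @original_publisher = N'PUB1',
    @publisher_db = N'AW',
    @redirected_publisher = N'PubAGListener';
GO
-- Variant for a listener on a non-default port (port number is an assumption):
-- EXEC sys.sp_redirect_publisher
--     @original_publisher = N'PUB1',
--     @publisher_db = N'AW',
--     @redirected_publisher = N'PubAGListener,51433';
```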

Validate redirected publisher

Make remote desktop connection to DIST1 and execute command to validate redirected publisher.
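The validation can be sketched with sp_validate_replica_hosts_as_publishers, which checks every replica in the Publisher's Availability Group. This assumes the same original publisher and database names used above.

```sql
-- Run on DIST1 while logged on to that server directly.
USE distribution;
GO
DECLARE @redirected_publisher sysname;
EXEC sys.sp_validate_replica_hosts_as_publishers
    @original_publisher = N'PUB1',
    @publisher_db = N'AW',
    @redirected_publisher = @redirected_publisher OUTPUT;
-- Shows the listener that PUB1 is currently redirected to.
SELECT @redirected_publisher AS redirected_publisher;
GO
```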

Run sp_addsubscription using a TSQL script to specify the subscriber’s listener.

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/replication-subscribers-and-always-on-availability-groups-sql-server

In this walkthrough our subscriber also participates in an availability group. We’ll use PUSH subscription and specify the subscriber’s Listener SubAGListener as the @subscriber.
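A sketch of the push subscription, following the pattern in the Microsoft documentation linked above. The publication name TranProducts is inferred from the agent name that appears later in this post. The agent is assumed to use Windows authentication at the subscriber.

```sql
-- Run on the primary Publisher replica, in the publication database.
USE [AW];
GO
EXEC sp_addsubscription
    @publication = N'TranProducts',      -- inferred name (assumption)
    @subscriber = N'SubAGListener',      -- listener, not SUB1 or SUB2
    @destination_db = N'AWTEST',
    @subscription_type = N'Push',
    @sync_type = N'automatic',
    @article = N'all',
    @update_mode = N'read only',
    @subscriber_type = 0;
GO
EXEC sp_addpushsubscription_agent
    @publication = N'TranProducts',
    @subscriber = N'SubAGListener',
    @subscriber_db = N'AWTEST',
    @subscriber_security_mode = 1;       -- Windows authentication (assumption)
GO
```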

Checking the Distribution Agent job step properties, notice the listener name for the -Subscriber parameter. The re-directed Publisher entry we made earlier will handle the PUB1 to PubAGListener mapping.

DIST2 needs to connect to the Subscriber’s listener when replicating changes. Execute the command below on DIST2 to add a linked server to the Subscriber’s listener SubAGListener. Note the connection uses the listener name, not the FQDN and not the server name. If the subscriber were not participating in an Availability Group, @server would equal the subscriber’s SQL Server name.

To support Replication, a linked server from the Publishers to the Subscribers is also required. This was created on PUB1 when sp_addsubscription was executed. For PUB2, we’ll need to call sp_addlinkedserver directly, using the subscriber’s listener SubAGListener.
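Both linked servers can be sketched with the same call, modeled on the example in the Microsoft documentation linked below. This assumes SQLCMD mode and targets the second Distributor replica (DIST2) and PUB2.

```sql
-- Requires SQLCMD mode. @server is the Subscriber's listener name.
:CONNECT DIST2
EXEC sys.sp_addlinkedserver
    @server = N'SubAGListener';
GO
:CONNECT PUB2
EXEC sys.sp_addlinkedserver
    @server = N'SubAGListener';
GO
```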

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/configure-replication-for-always-on-availability-groups-sql-server

When set up, the Publisher(s) will show linked server connections to repl_distributor (mapped to DISAGLISTENER) and to the subscriber’s Availability Group listener SubAGListener.

Missing Replication Agent Jobs

Before fail-over, if you compare the SQL Agent Replication jobs on DIST1 and DIST2, you may notice differences. However, after an Availability Group fail-over, a new Agent job called “Monitor and sync replication agent jobs” runs every 1 minute and updates the SQL Agent jobs on DIST2 with the correct parameters.

Testing Publisher Fail-over

Using Replication Monitor register the Publisher’s listener PUBAGLISTENER.

Connect to the Distributor’s listener to retrieve publisher settings.

If SQL Server Management Studio 18.x is installed, Replication Monitor will correctly display the Publisher’s listener PubAGListener, fully supporting all functionality including Tracer Tokens.

When you initiate fail-over from SQL Server Management Studio, you’ll notice the Log Reader agent enter a failed state and attempt to reconnect to the Publisher’s listener PubAGListener.

Remember to always initiate fail-over from SQL Management Studio or TSQL, never from Cluster Administrator.
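For the TSQL route, a planned manual fail-over is a single statement executed on the secondary replica that should become primary. This sketch assumes PUB2 is a synchronized secondary and that SQLCMD mode is enabled.

```sql
-- Requires SQLCMD mode. Connect to the TARGET secondary, not the current primary.
:CONNECT PUB2
ALTER AVAILABILITY GROUP [PubAG] FAILOVER;
GO
```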

Replication Monitor, as expected, shows that the Log Reader agent’s connection to the Publisher was closed during the Publisher’s fail-over.

Looking at the Log Reader SQL Agent Job history, you’ll see normal delivery, then “connection forcibly closed”, then “The replication agent encountered an error and is set to restart within the job step retry interval.” (fishhook), followed by “The replication agent has been successfully started” on re-connection, looking something like this:

Testing Distributor Failover

Again, using SSMS, you can initiate fail-over of the Distributor or Subscriber role from either the Primary or Secondary replica.

When the Distributor Availability Group fails over, both the Log Reader (Pull) and the Distribution Agent (Push) jobs fail and are then restarted on the new primary Distributor.

Expected Log Reader messages during Distributor fail-over:

Error messages:

The process could not execute ‘sp_repldone/sp_replcounters’ on ‘PUB2’. (Source: MSSQL_REPL, Error number: MSSQL_REPL20011)

Get help: http://help/MSSQL_REPL20011

Only one Log Reader Agent or log-related procedure (sp_repldone, sp_replcmds, and sp_replshowcmds) can connect to a database at a time. If you executed a log-related procedure, drop the connection over which the procedure was executed or execute sp_replflush over that connection before starting the Log Reader Agent or executing another log-related procedure. (Source: MSSQLServer, Error number: 18752)

Get help: http://help/18752

The process could not set the last distributed transaction. (Source: MSSQL_REPL, Error number: MSSQL_REPL22017)

Get help: http://help/MSSQL_REPL22017

The process could not execute ‘sp_repldone/sp_replcounters’ on ‘PUB2’. (Source: MSSQL_REPL, Error number: MSSQL_REPL22037)

Get help: http://help/MSSQL_REPL22037

To verify that the Replication agents have restarted on the new Distributor replica and reconnected, insert a Tracer Token. Here the latency from Distributor fail-over to the pending Tracer Token being fully replicated was 7 seconds.
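Tracer tokens can also be posted from TSQL instead of Replication Monitor. The sketch below assumes the publication is named TranProducts (inferred from the agent name shown later in this post). The 10 second wait is an arbitrary choice.

```sql
-- Run on the primary Publisher replica, in the publication database.
USE [AW];
GO
DECLARE @token_id int;
EXEC sys.sp_posttracertoken
    @publication = N'TranProducts',   -- inferred name (assumption)
    @tracer_token_id = @token_id OUTPUT;

-- Give the agents a moment to deliver the token (arbitrary wait).
WAITFOR DELAY '00:00:10';

-- Shows Publisher-to-Distributor and Distributor-to-Subscriber latency.
EXEC sys.sp_helptracertokenhistory
    @publication = N'TranProducts',
    @tracer_id = @token_id;
GO
```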

Testing Subscriber Failover

Similarly, when the subscriber fails over to a synchronized secondary replica, the Push Distribution agent will enter “retry”, then reconnect via the Subscriber’s listener “SubAgListener” to the new primary replica.

Error messages:

Agent ‘PUB1-AW-TranProducts-SUBAGLISTENER-1002’ is retrying after an error

Once again, Replication Monitor Tracer Tokens validate that re-connection is complete and data is once again flowing through the topology.

Chris Skorlinski
Microsoft SQL Server Escalation Services



3 thoughts on “Walkthrough Publisher, Distributor, Subscriber in AlwaysOn Availability Groups”

  1. Yes it works.
    YES IT WORKS!
    Thank you for this beautiful article. You are my saver.
    I can tell you that your action will also work for the following scenario:
    Two Servers DB1 and DB2 with two Enterprise Instances on each server
    – default instance
    – Distributor instance
    Where Published and Subscriber’s database both reside on the same instance but on different AGs.
    So we have three AGs:
    – PubAG
    – SubAG
    – DisAG
    All the magic is in carefully picking the right port configuration for each instance so that you can use each AG listener without a port or instance name to connect to each AG.

    Thanks again

    1. I just updated the article with instruction for creating the Listener’s Alias when using non-default ports as use of non-default ports makes it a little more complicated.

  2. Just in time, looking for the AG-Replication combination details of this sort. Excellent Article…!! Thank you Chris.
