Friday, August 25, 2017

SQL Server Replication snapshot initialization taking long time to be applied

Hello all,
In today's post I want to share a tip that helped me troubleshoot an issue I was facing and couldn't find much documentation about it. To my friends that read and don't have much knowledge in databases, this entry will look like an estrange language but I promise I'm working in some post to walk you through the database world and understated some of the terms used here.

Issue: Had to reinitialize a subscription from a transactional replication. When doing the initialization with the snapshot, the process was taking a huge amount of time, about 13 hours and it has only applied the snapshot for 12 (out of 24) tables, based on that speed the entire process would have taken about 26-30 hours to complete.

The snapshot was generated in 12 minutes and it was generated taking advantage of  parallelism, however I noticed that when the snapshot was getting applied it was processing one file at the time (due to the parallelism of the snapshot generation, it split the snapshot files in 48 chunks for each table) so I started to wonder why the initialization was sequential (Serialized) whereas the generation was in parallel.

By doing a deep research on the web, I found lots of explanations such as this one (https://blogs.msdn.microsoft.com/sqlsakthi/2011/08/07/maxbcpthread-parameter-for-snapshot-and-distribution-agent-in-sql-server-replication/) which explain how the parameter "-MaxBcpThreads" work, however it doesn't explains how to get it implemented. You can find that parameter in the Distribution Agent for the publisher server, however you can't change the value from the user interface (UI) of SSMS.

So I stop trying to find for an specific solution through the web and started to build my own. Time to remove the dust and start managing the replications with T-SQL, so I looked for any stored procedures that would help me change that parameter so I found this "sp_change_agent_parameter" in there you can update the parameters for all the database agents. You can query your current Distributors but using the following query. lets call it Query1 for future reference:
SELECT
[publication] as Publication
,c.srvname as SubscriberName
,b.profile_name as Profile
,[name] as DistributionJobName
,b.profile_id
FROM [distribution].[dbo].[MSdistribution_agents] a
inner join msdb.[dbo].[MSagent_profiles] b
on a.profile_id = b.profile_id
inner join master..sysservers c
on a.subscriber_id = c.srvid
order by b.profile_name

From those results, validate if the publication you want to update is listed there, and take the value from the profile_id column.
Important note before updating this values, in the URL mentioned above, there is an important note: "When applying a snapshot that was generated at the Publisher using the concurrent snapshot option, one thread is used, regardless of the number you specify for MaxBcpThreads."
Meaning that you will need to verify the Sync_method that is configured with your publication, if it is Concurrent, then you will have to update your sync method before updating your Distribution agent properties, you can validate the method with this command "EXEC sp_helppublication @publication = 'YourPublicationName'" and look for the 7th column, this command will need to be executed on your published database. The column value will be a number, you can match each result to this property:
Synchronization mode:

0 = Native bulk copy program (bcp utility)
1 = Character bulk copy
3 = Concurrent, which means that native bulk copy (bcputility) is used but tables are not locked during the snapshot
4 = Concurrent_c, which means that character bulk copy is used but tables are not locked during the snapshot.
source(https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-helppublication-transact-sql)

Once that you have verify and noticed that you need to update the sync method to your publication, you can do it with the following command:

sp_changepublication
@publication = 'YourPublicationName',
@property = 'sync_method',
@force_invalidate_snapshot =1,
@force_reinit_subscription =1,
@value = 'native';
GO

Once that you have that set, you can use the following command to update the value you want for the maximum Bcp Threads where Profile_ID is the one you got from the Query1:

sp_change_agent_parameter
@profile_id = 4,
@parameter_name = '-MaxBcpThreads',
@parameter_value = '4'
go

What would this change do? Enable you to use x number of threads (based on what you configure in @parameter_value) to read the snapshot files and get it apply in parallel. One thing that worth's mentioning, when you check for your active executions (e.g. sp_who) you will notice that some of the parallel threads will get blocked, don't worry about it, is because you are dealing with the same table, you will gain that parallelism when you process various tables at the same time, so in here my advice is to configure that parameter_value to a number close to the half of the files that were generated always taking in account your available processors, you don't want to affect the performance for the other databases when trying to re-mediate one issue.

The result? In my environment I was able to apply the snapshot (24 tables) in 6hrs and 13 minutes, the Published database was a 1.5TB database and the snapshot weighted 185GB, without the change the process ran for more than 13hrs and only 12 tables (50% in number 35% in data) were applied, hope this tip helps you get the snapshot applied faster.

Thanks for reading!

No comments:

Post a Comment

Tools to connect to a SQL database