Sunday 14 October 2012

Exploiting filtering for SQL replication

I came across a novel way of making use of SQL Servers replication the other day.  It erked me so much I thought I would share it with the rest of my casual readers to see what they think about it.  I may just be too much of a purist but the way it was implemented went against everything I've learnt as a computer scientist.

Replication Filters


For those that are unfamiliar with replication, the idea is to propegate data from a central database down to subscribers.  In terms of a business model, this might be head office as the replication publisher and the branches are the subscribers.

For each block of data that needs to get replicated, you can apply a filter as not all data may be relevant to the subscribers.  This would normally be in the form of a Flag for one of the columns, or where the date is within the last x days etc.  The options are only limited by the design of ones database, or in this case, the person's imagination.

The filter applied was:
SELECT <published_columns> FROM [dbo].[Branch] WHERE Code = HOST_NAME()

*HOST_NAME() usually returns the machine name that the script is running on
 

Publisher and Subcriber


The normal process is for the Publisher to Push the data, or the subcriber to pull the data.  Either is pefectly valid, it just depends where you want the processing load to lie.  If you have a lot of subscribers it would be considered  best practise to focus the load on the subscribers so therefore it would be a pull subscription.

When the subscriber runs , it issues a command to SQL server to synchronise the data.  There are various switches and options that can be applied to this command.  One of them is [Hostname].  In this particular instance they have specified a different hostname depending on which branch the subscription is run on.  Therefore the Hostname matches up with the Code column of table Branch.

replmerg.exe" -Publisher [EPOSTEST] -PublisherDB [POSCentral] -Publication [Merge_POSCentral] -Subscriber [eckpostest2] -SubscriberDB [POSPEN] -SubscriptionType 1 -SubscriberSecurityMode 1 -Distributor [EPOSTEST] -HOSTNAME [PEN] -QueryTimeout 300 -LoginTimeOut 30


Aside from the fact this is not how the [Hostname] flag was intended to be used it masks what is truely happening.  When SQL scripts run the hostname and other windows data is used to audit who ran the script on which machine and at what time.  To someone who is unfamiliar with this logic in the replication may end up searching for a machine on the domain that doesn't exist.

What's more is this functionality is not guaranteed to work with future versions of SQL server because the hostname flag is a minor option of replication.

No comments:

Post a Comment