SQL Server Replication-
The Crib Sheet
For things you need to know rather than the things you want to know
Contents
- Introduction
- Replication topologies
- Replication Methods
- Replication Agents
- Monitoring Replication
- Checking throughput
- Validating
- Changing the settings
- Articles
- Updating articles
- Programming Replication Topologies
- Further Reading
Introduction
Replication is intended to be a way of distributing 
data automatically from a source database to one or more recipient 
databases. As such, it can have obvious uses in a distributed system. It
 has also been used to implement high-availability systems. It is not 
useful for one-off synchronization, or for simply copying data. It is 
intended as a long-term data relationship between databases. Typical 
uses are in:
- Data warehousing and reporting
- Integrating data from several, possibly only partially connected, sites
- Improving scalability and availability
- Integrating heterogeneous data from other databases via OLE DB, integrating data from mobile users, getting data to and from Point-Of-Sale systems
- Offloading/delegating batch processing tasks.
- Distributing data 'owned' by a particular application to other applications that are 'consumers' of that data. (For example, sales records to reporting services, manufacturing stock levels to purchasing systems.)
- Creating several instances of a database to distribute load on it
- Updating a central database with information from a number of remote Laptops that might be only partially connected, and to resynchronise the laptops.
Replication uses a 'Magazine Publishing' vocabulary. Anything from an 'Article' to an entire database can be replicated. An Article is the smallest component of distribution, and can be a table, procedure or function. If it is a table, then a filter can be applied to it so that only certain rows or columns are replicated. This 'Magazine Publishing' analogy can be confusing because, in replication, a Subscriber can sometimes make updates, and a Publisher usually sends out incremental changes to the articles in a publication.
A 'Publisher' maintains the original copy of the data. It holds the definition of the 'Publication', which defines the 'articles' that are to be 'published'. (The database with the original location of the data determines what is to be distributed).
A 'Subscriber' receives the articles from a publisher. It can subscribe to one or more publications. Any database can take on either role or even both roles at once.
A Distributor is a specialist database that runs the 'Replication agents'.
Replication is not part of the SQL Server engine, but an external application. This makes it much easier to involve other database systems in replication. Any SQL Server database, or other database system with an OLE DB provider, can be a publisher or subscriber in snapshot or transactional replication.
It is essential to plan out the replication in detail as a first stage, and to be very certain of the type of replication you wish to implement. A common mistake is to use replication in cases where a much less complex solution is possible.
A problem with production systems using replication is the difficulty of restoring the topology after a disaster.. This requires a fully documented recovery strategy, which has to be periodically tested and practiced. This means that the whole replication topology and configuration must be scripted so it can be re-created in an emergency, even if the system was originally built using the GUI tools. The Object Browser (or Enterprise Manager) makes the initial deployment relatively simple to do, and there are plenty of step-by-step guides, but it is not the best option when trying to restore an existing topology, and settings, in order to achieve a recovery from major failures.
Problems often follow from developers adding or dropping articles, changing publication properties, and changing schema on published databases. It is therefore best to create the replication once the design of the publisher is relatively stable.
Replication topologies
In most topologies, it makes sense for publishers, distributors, and subscribers to be on separate physical hardware.
Central Publisher
The commonest form of replication is to have a single
 publisher with the source data, with one or more subscribers. The 
Distributor database can be on the same, or preferably different, 
server.
Central Subscriber
Often, where the data from several databases need to 
be 'warehoused' centrally in an OLAP or reporting database, one will 
find a single 'reporting' database subscribing to several publications.
One can come across other topologies such as 
'bi-directional' and 'peer to peer' which are really special cases of 
Central Publisher or Central Subscriber and use transactional 
replication
Publishing Subscriber
The distribution of data can be relayed to other 
subscribers via a publishing subscriber. This allows replication to be 
implemented over low-bandwidth WANs to a subscriber that, in turn, 
distributes it to other servers within its high-bandwidth LAN
Replication Methods
How Replication Works
Replication begins with the initial synchronization of the published objects between the Publisher and Subscribers, using a snapshot. A snapshot is a copy of all of the objects and data specified by a publication. After the snapshot is created on the publisher, it is delivered to the Subscribers via the distributor.For Snapshot replication, this is sufficient. For other types of replication, all subsequent data changes to the publication flow to the Subscriber as they happen, in a queue, or on request.
Snapshot Replication
The snapshot replication process provides the initial
 synchronization for transactional and merge publications. However, in 
several cases, this initial synchronization is all that is necessary. 
This would include circumstances where data hardly changes, or if the 
latest version of the data is not essential to the subscriber, where the
 amount of data is small, or if a large number of changes takes place 
rapidly.
Snapshot replication involves copying the articles 
that make up the publication. Normally, if they exist already on the 
subscriber, they are over-written, though this behavior can be changed. 
Snapshot replication is more expensive in terms of overhead and network 
traffic and only takes place at intervals. Because locks are held during
 snapshot replication, this can impact other users of the subscriber 
database. It is therefore more suitable for static data and 
enumerations. In SQL Server 2005, several articles can be processed in 
parallel, and interrupted snapshots can be recommenced from the point of
 interruption. Snapshots can be queued or immediate.
Data changes are not tracked for snapshot 
replication; each time a snapshot is applied, it completely overwrites 
the existing data.
Transactional Replication
Transactional replication is used if:
- Changes to the data must be propagated immediately
- The database application taking out a subscription needs to react to every change
- The Publisher has a very high volume of insert, update, and delete activity
- The Publisher or Subscriber is a different database application reached via OLE DB.
Essentially, Transaction replication distributes data
 in one direction, but transactional replication does offer options that
 allow updates at the Subscriber. Once a snapshot replication has 
synchronized the subscribers with the publisher, all committed 
transactions on the publisher are then propagated to the subscribers in 
sequence, via distributed transactions. One can select a queued update 
or immediate, depending on requirements.
Peer-to-peer Replication
This is a special type of transactional replication 
in which every participant is both a publisher and subscriber (2005 
Enterprise only) and is most useful for up to ten databases in a 
load-balancing or high-availability group.
Bidirectional Replication
This is where two databases replicate the same 
articles to each other via a distributor. There must be loopback 
detection. Data conflicts aren't handled and the replication must be 
implemented in code, since the GUI doesn't support it.
Transactional replication tracks changes through the SQL Server transaction log
Merge Replication
Merge replication allows various sites to work autonomously and later merge updates into a single, uniform result.
Merge Replication is complex, but provides the means 
to implement part of a high-availability system, as well as its original
 purpose of serving mobile and disconnected users. It is designed for 
cases where the publishers are not in constant communication with the 
subscribers. After the initial snapshot synchronization, subsequent 
changes are tracked locally with triggers, and the databases are merged 
when in contact, using a series of rules to resolve all possible 
conflicts.
Merge replication is used when several Subscribers 
might need to update the same data at various times and propagate those 
changes back to the Publisher and thence to other Subscribers. It is 
also required in applications that involve Subscribers receiving data, 
making changes offline, and finally reconnecting with the publisher to 
synchronize changes with the Publisher and other Subscribers.
To make this possible, each Subscriber requires a 
different partition of data and there has to be a set of rules to 
determine how every conflict that takes place in the update of the data 
is detected and resolved. These conflicts occur when the data is merged 
because there can be no 'locking' and so the same data may have been 
updated by the Publisher and by more than one Subscriber.
Merge Replication does not use transactions. Merge 
replication uses a set of conflict-resolution rules to deal with all the
 problems that occur when two databases alter the same data in different
 ways, before updating the subscribers with a 'consensus' version. It 
normally works on a row-by-row basis but can group rows of related 
information into a logical record. One can specify the order in which 
'articles' are processed during synchronisation.
Merge replication tracks changes through triggers and metadata tables.
Replication Agents
Replication is done by several different agents, which 
are separate applications each responsible for part of the process. The 
replication agents should not be run under the SQL Server Agent account 
in a production system. Instead, they need the minimal permissions 
necessary to perform their function.
SQL Server Agent
This manages the overall replication process via SQL Server Agent jobs.
The Snapshot agent
Snapshot.exe executes on the 
Distributor. It extracts the schema and data defined by the publication,
 which is then sent to the subscriber via a 'snapshot folder'. It also 
updates status information on the distribution database. . It is used in
 all forms of replication
The Log Reader Agent
LogRead.exe is used in transactional
 replication to extract relevant committed transactions from the 
publisher's log, repackage them and send them to the distributor in the 
correct sequence.
Distribution Agent
Distrib.exe takes the snapshots, and log entries from the agents we've described, and dispatches them to the subscribers.
Merge Agent
ReplMer.exe is used only in Merge 
Replication to send a snapshot when the subscriber is initialized, and 
also exchanges transactions between publisher and subscriber
Queue Reader Agent
QrDrSvc.exe is used to queue the updates in transactional or snapshot replication when queuing has been specified.
Monitoring Replication
Many problems associated with replication can be 
avoided by .regular checks. The most obvious check is to make sure that 
the data has been transferred as expected. Periodic checks with SQL 
Compare and SQL Data Compare can be very useful in addition to the tools
 that come with Replication. Additionally the replication processes and 
jobs need to be checked to make sure they are working.
Checking throughput
The performance of replication must be regularly monitored, and performance-tuned as necessary.
The Replication Monitor is used to check on the 
operational state of publications, and inspect the history, and errors. 
Right-clicking the replication node in Object Explorer will gain access 
to it.
One of the most important concerns is the time delay,
 or latency, of transactions from the publications appearing in the 
subscriber database. At times of high transaction throughput on the 
publisher database, bottlenecks can occur. Whereas the stored procedure 
sp_browseReplCmds on the distribution database can tell you how far 
behind the synchronisation is at any particular time, one cannot 
determine where the problems lies just from the data. Tracer tokens are 
now used to measure the actual throughput of the replication 
architecture at any particular time to help diagnose such bottlenecks.
Validating
There is always an element of doubt as to whether the
 replication has entirely worked. There are stored procedures provided 
to compare the 'articles' on the publisher and subscribers to make sure 
they are the same.
The sp_publication_validation stored procedure 
validates the data associated with each article by calling 
sp_article_validation (after the articles associated with a publication 
have been activated). The sp_article_validation stored procedure invokes
 sp_table_validation stored procedure, which calculates the number of 
lines and, optionally, the checksum of the published table. It is 
considered good practice to perform a daily row-count and weekly 
checksum. SQL Data Compare is ideal for mending a broken replication.
The Distribution Agent raises the '20574' system 
message if validation fails, or the '20575' system message if it passes.
 The Distribution Agent will replicate changes to a subscriber even if 
the validation shows that the subscriber is out of synchronization. It 
is a good policy to configure the Replication Alert on the '20574' 
message so as to send E-Mail, Pager, or Network notification.
This validation approach will only work within 
certain restrictions. For example, it will not work if certain filters 
have been applied. They should be used with caution.
Changing the settings
It is best to use the default replication settings 
unless there are clear performance gains to be made, or if the 
application design forces the issue. However, one cannot assume that the
 changes will be generally beneficial to the entire topology without 
comprehensive testing.
Articles
Articles are the smallest unit of a publication. An 
article can be a table, view, stored Procedure or function. Where an 
article is based on a table or view, it can contain all the data or just
 part of it. These filters of two types.: More common are the static 
'WHERE' clauses, but filters can be used dynamically in Merge 
Replication to publish different 'content' (rows) to different 
'subscribers' (databases receiving data). These latter Filters are 
called 'Dynamic' and can be simple Row Filters, or Join Filters, where 
the selection of rows to publish is based on a join with other tables, 
rather than a simple WHERE clause.
Normally, any alteration to an article that is a table is propagated 
to all the subscribers. You can also opt to propagate schema objects 
associated with the article such as indexes, constraints, triggers, 
collation and extended properties.Updating articles
In Merge replication, the subscriber can update the 
article. This is, of course, a recipe for conflict, and these have to be
 resolved automatically. When the Merge Agent comes across a row that 
might have changed recently, it examines the history or 'lineage' of 
each site's version of the row to see if there is a conflict. If so, 
then the update that is finally used. Has to be based on either
- A "first wins" resolution,
- a user-specified priority scheme to determine the update to select,
- a customised resolution, using COM and stored procedures.
The 'lineage' is a history of changes in a table row 
in MSmerge_contents, which is maintained automatically when a user 
updates a row. Each column contains one entry for each site that has 
updated the row.
Conflicts to the data in the base table can occur 
within a column or a row. Most usual are column-tracked articles this 
means that, within any row, updates are only recognized as conflicts if 
the same column is updated by more than one subscriber. Occasionally, 
however, the business rules of the application may treat simultaneous 
changes to the any column within the row as a conflict, in which case 
row-level tracking is used.
Programming Replication Topologies
Replication agents and replication topologies can be 
administered and monitored remotely via SQL Scripts or RMO scripts. The 
task of building and maintaining replication topologies is made much 
easier of all parts of the deployments are scripted, even if this is 
done after the other methods such as wizards or RMO are used for the 
initial operation.
There are other uses for a replication script. It will be required in
 end-user application where, for example, such as s a pull subscription 
is synchronized when the user clicks a button, or where a routine 
administration task such as monitoring replication throughput is 
performed from a custom console. It is also generally used for writing 
customized business rules that are executed when a merge subscription is
 synchronized.One can use the Object Explorer in SSMS, or the Enterprise Manager in earlier versions of SQL Server, to set up replication. BOL provide worked examples. Alternatively, RMO can be used with VB or C# to script the replication. Whatever system you use, it is a good idea to use the Transact SQL script as the reference for the replication topology you create
Ultimately, the functionality in a replication topology is provided by system stored procedures. The easiest approach is to use Transact-SQL script files to perform a logical sequence of replication tasks, because it provides a permanent, repeatable, copy of the steps used to deploy the replication topology that can, for example, be used to configure more than one subscriber. It also provides a measure of documentation and disaster-recovery. A script can be stored as a query object in a SQL Server Management Studio project.
Replication scripts can be created by hand, by the script generation of the replication wizards in SQL Server Management Studio, or by using Replication Management Objects (RMOs) to programmatically generate the script to create an RMO object.
When creating scripts to configure replication, it is best to use Windows Authentication so as to avoid storing security credentials in the script file. Otherwise you must secure the script file
Further reading:
SQL Server ReplicationFor the details of implementing replication, all the steps are documented here in Implementing Replication
details on configuring and maintaining a replication are here Configuring and Maintaining replication
For details on peer-to-peer replication, read Peer-to-Peer Transactional Replication
Some of the wider issues and dangers of replication are discussed here Data Replication as an Enterprise SOA Antipattern, in the excellent Microsoft Architecture Journal
 
 
 
No comments:
Post a Comment