Vous êtes sur la page 1sur 6

Replication Publishing Model Overview

Replication uses a publishing industry metaphor to represent the components in a replication topology, which
include Publisher, Distributor, Subscribers, publications, articles, and subscriptions. It is helpful to think of Microsoft
SQL Server replication in terms of a magazine:

• A magazine publisher produces one or more publications

• A publication contains articles

• The publisher either distributes the magazine directly or uses a distributor

• Subscribers receive publications to which they have subscribed

Although the magazine metaphor is useful for understanding replication, it is important to note that SQL Server
replication includes functionality that is not represented in this metaphor, particularly the ability for a Subscriber to
make updates and for a Publisher to send out incremental changes to the articles in a publication.

A replication topology defines the relationship between servers and copies of data and clarifies the logic that
determines how data flows between servers. There are several replication processes (referred to as agents) that
are responsible for copying and moving data between the Publisher and Subscribers. The following illustration is an
overview of the components and processes involved in replication.

Publisher
The Publisher is a database instance that makes data available to other locations through replication. The Publisher
can have one or more publications, each defining a logically related set of objects and data to replicate.

Distributor
The Distributor is a database instance that acts as a store for replication specific data associated with one or more
Publishers. Each Publisher is associated with a single database (known as a distribution database) at the
Distributor. The distribution database stores replication status data, metadata about the publication, and, in some
cases, acts as a queue for data moving from the Publisher to the Subscribers. In many cases, a single database
server instance acts as both the Publisher and the Distributor. This is known as a local Distributor. When the
Publisher and the Distributor are configured on separate database server instances, the Distributor is known as a
remote Distributor.
Subscribers
A Subscriber is a database instance that receives replicated data. A Subscriber can receive data from multiple
Publishers and publications. Depending on the type of replication chosen, the Subscriber can also pass data
changes back to the Publisher or republish the data to other Subscribers.

Article
An article identifies a database object that is included in a publication. A publication can contain different types of
articles, including tables, views, stored procedures, and other objects. When tables are published as articles, filters
can be used to restrict the columns and rows of the data sent to Subscribers.

Publication
A publication is a collection of one or more articles from one database. The grouping of multiple articles into a
publication makes it easier to specify a logically related set of database objects and data that are replicated as a
unit.

Subscription
A subscription is a request for a copy of a publication to be delivered to a Subscriber. The subscription defines what
publication will be received, where, and when. There are two types of subscriptions: push and pull.

Types of Replication Overview


Microsoft SQL Server 2005 provides the following types of replication for use in distributed applications:

• Transactional replication.

• Merge replication.

• Snapshot replication.

The type of replication you choose for an application depends on many factors, including the physical replication
environment, the type and quantity of data to be replicated, and whether the data is updated at the Subscriber.
The physical environment includes the number and location of computers involved in replication and whether these
computers are clients (workstations, laptops, or handheld devices) or servers.

Each type of replication typically begins with an initial synchronization of the published objects between the
Publisher and Subscribers. This initial synchronization can be performed by replication with a snapshot, which is a
copy of all of the objects and data specified by a publication. After the snapshot is created, it is delivered to the
Subscribers. For some applications, snapshot replication is all that is required. For other types of applications, it is
important that subsequent data changes flow to the Subscriber incrementally over time. Some applications also
require that changes flow from the Subscriber back to the Publisher. Transactional replication and merge replication
provide options for these types of applications.

Data changes are not tracked for snapshot replication; each time a snapshot is applied, it completely overwrites
the existing data. Transactional replication tracks changes through the SQL Server transaction log, and merge
replication tracks changes through triggers and metadata tables.
Replication Agents Overview
Replication uses a number of standalone programs, called agents, to carry out the tasks associated with tracking
changes and distributing data. By default, replication agents run as jobs scheduled under SQL Server Agent, and
SQL Server Agent must be running for the jobs to run. Replication agents can also be run from the command line
and by applications that use Replication Management Objects (RMO). Replication agents can be administered from
SQL Server Replication Monitor and SQL Server Management Studio.

SQL Server Agent


SQL Server Agent hosts and schedules the agents used in replication and provides an easy way to run replication
agents. SQL Server Agent also controls and monitors operations outside of replication.

IMP: By default, the SQL Server Agent service is disabled when SQL Server 2005 is installed unless you
explicitly choose to autostart the service during installation.

Snapshot Agent
The Snapshot Agent is typically used with all types of replication. It prepares schema and initial data files of
published tables and other objects, stores the snapshot files, and records information about synchronization in the
distribution database. The Snapshot Agent runs at the Distributor.

Log Reader Agent


The Log Reader Agent is used with transactional replication. It moves transactions marked for replication from the
transaction log on the Publisher to the distribution database. Each database published using transactional
replication has its own Log Reader Agent that runs on the Distributor and connects to the Publisher (the Distributor
can be on the same computer as the Publisher).

Distribution Agent
The Distribution Agent is used with snapshot replication and transactional replication. It applies the initial snapshot
to the Subscriber and moves transactions held in the distribution database to Subscribers. The Distribution Agent
runs at either the Distributor for push subscriptions or at the Subscriber for pull subscriptions.

Merge Agent
The Merge Agent is used with merge replication. It applies the initial snapshot to the Subscriber and moves and
reconciles incremental data changes that occur. Each merge subscription has its own Merge Agent that connects to
both the Publisher and the Subscriber and updates both. The Merge Agent runs at either the Distributor for push
subscriptions or the Subscriber for pull subscriptions. By default, the Merge Agent uploads changes from the
Subscriber to the Publisher and then downloads changes from the Publisher to the Subscriber.

Queue Reader Agent


The Queue Reader Agent is used with transactional replication with the queued updating option. The agent runs at
the Distributor and moves changes made at the Subscriber back to the Publisher. Unlike the Distribution Agent and
the Merge Agent, only one instance of the Queue Reader Agent exists to service all Publishers and publications for
a given distribution database.

Replication Maintenance Jobs


Replication has a number of maintenance jobs that perform scheduled and on-demand maintenance.
Implementing Replication
This section provides information about the stages involved in implementing replication.

The process of implementing replication is different depending on the type of replication and the options you
choose, but, in general, replication is composed of the following stages:

• Configuring replication and publishing data

• Creating and initializing subscriptions

• Synchronizing data

This topic provides information on each step, with links to more detailed descriptions. In addition to understanding
the steps required to configure replication, it is important to understand considerations for:

• Implementation.

• Security.

• Performance.

• Backup and restore.

Configuring Replication and Publishing Data


Replication deployment begins when you configure the Publisher and Distributor. The Distributor plays a primary
role in transactional replication; however it plays a more limited role in merge and snapshot replication, being used
only for agent history and error reporting and monitoring purposes. Merge replication and snapshot replication
typically use a Distributor that runs on the same computer as the Publisher, whereas transactional replication might
use a remote Distributor, particularly if the Publisher is a high throughput OLTP system.

After the Publisher and Distributor are configured, you can create publications based on data, subsets of data, and
database objects. When you create a publication, you determine:

• The data and database objects that you want to replicate.

• Which type of replication you will use and which replication options, including filtering.

• Where the snapshot files will be stored and when the initial synchronization will occur, unless you are

delivering the initial dataset manually.

• Other properties to set for the publication.

Depending on the type of replication and the options you chose when configuring the publication, the Subscriber
might be able to modify data after the initial dataset has been delivered and propagate these data changes to the
Publisher, which can then propagate the changes to other Subscribers. The following replication types allow
Subscribers to modify replicated data and have those modifications propagated back to the Publisher:
• Merge replication.

• Transactional replication with updatable subscriptions.

• Peer-to-peer transactional replication.

• Bidirectional transactional replication.

Creating and Initializing Subscriptions


After you create a publication, you can create subscriptions and configure additional options.

Whether you choose snapshot replication, transactional replication, or merge replication, replication by default
creates an initial snapshot of the publication schema and data, and then saves it to the snapshot folder location
you specified. After a subscription is created, the initial snapshot is applied based on the schedule you indicated
when creating the publication. You can bypass one or more of the snapshot steps if the Subscriber already has the
initial dataset or you want to apply it manually.

Synchronizing Data
Synchronization is the process of data being propagated between the Publisher and Subscribers after the initial
dataset has been applied at the Subscriber. For snapshot replication, synchronization means reapplying the
snapshot at the Subscriber so that schema and data at the subscription database is consistent with the publication
database. For transactional replication, synchronizing data means that data modifications, such as inserts, updates,
and deletes, are distributed between the Publisher and Subscribers (and from Subscribers back to the Publisher in
the case of updating subscriptions). For merge replication, synchronization means that data modifications made at
multiple sites are merged, any conflicts are detected and resolved, and data eventually converges to the same data
values at all sites.

Configuring Distribution
The Distributor is a server that contains the distribution database, which stores metadata and history data for all
types of replication and transactions for transactional replication. To set up replication, you must configure a
Distributor. Each Publisher can be assigned to only a single Distributor instance, but multiple publishers can share a
Distributor. The Distributor uses these additional resources on the server where it is located:

• Additional disk space if the snapshot files for the publication are stored on the Distributor (which they

typically are).

• Additional disk space to store the distribution database.

• Additional processor usage by replication agents for push subscriptions running on the Distributor.

The server you select as the Distributor should have adequate disk space and processor power to support
replication and any other activities on that server. When you configure the Distributor, you specify the following:

• A snapshot folder, which is used, by default, for all Publishers that use this Distributor. Ensure that this

folder is already shared and has the appropriate permissions set.

• A name and file locations for the distribution database. The distribution database cannot be renamed after

it is created. To use a different name for the database, you must disable distribution and reconfigure it.
• Any Publishers authorized to use the Distributor. If you specify Publishers other than the instance on which

the Distributor runs, you must also specify a password for the connections the Publishers make to the
remote Distributor.

For transactional replication, after you configure distribution, we recommend that you:

• Size the distribution database appropriately. Test replication with a typical load for your system to

determine how much space is required to store commands. Ensure the database is large enough to store
commands without having to auto-grow frequently.

• Set the sync with backup option on the distribution database.

Local and Remote Distributors


By default, the Distributor is the same server as the Publisher (a local Distributor), but it can also be a separate
server from the Publisher (a remote Distributor). Typically, you would choose to use a remote Distributor if you
want to:

• Offload processing to another computer if you want minimal impact from replication on the Publisher (for

example, if the Publisher is an OLTP server).

• Configure a centralized Distributor for multiple Publishers.

Remote Distributors are more common in transactional replication than they are in merge replication for two
reasons:

• The Distributor plays a larger role in transactional replication because all replicated transactions are

written to and read from the distribution database.

• Merge replication topologies typically use pull subscriptions, so agents run at each Subscriber, rather than

all running at the Distributor. In most cases, you should use a local Distributor for merge replication.