• subscribe
July 19, 2005 12:00 AM

Merge Replication in the Real World

Create a high-availability and disaster-recovery solution that’s easy to configure and support
SQL Server Pro
InstantDoc ID #46828

Configure Distribution and Publishing
Before configuring merge replication, make sure that you have enough disk space on all servers. In addition to space for the database (ClientDB in this case), you'll need space to store elements such as the Distribution database, snapshot files, and backup files. Next, verify that SQL Server in both locations runs under a domain account. The domain account must have access to both servers and have access to the file share that stores snapshot files.

The easiest way to set up the Distributor and configure the Publisher is by using Enterprise Manager. Make sure that both servers (Publisher and Subscriber) are registered in Enterprise Manager. In Enterprise Manager's left pane, click Databases for the server you want to configure as Distributor (SQLServer1 in this case). On the toolbar, click Tools, select Replication, then select Configure Publishing, Subscribers, Distribution. If replication isn't already configured, the Replication Wizard appears. Click Next, and the Select Distributor screen appears. (If replication is already configured, you will see the Publisher and Distributor Properties screen.) Select Make SQLServer1 its own Distributor, then click Next.

If it appears that SQL Server Agent isn't running, select Yes, configure SQL Server Agent to start automatically on the Configure SQL Server Agent screen, then click Next. If SQL Server Agent is already running, you'll skip this configuration step. In the Specify Snapshot Folder dialog box, confirm the default selection (or choose your own) for the snapshot folder and click Next. If a message appears allowing you to confirm your folder choice, do so.

Next, on the Customize the Configuration screen, select Yes, let me to manually configure the distribution database properties, then click Next. Provide the location of data and log files for the distribution database, then click Next. If possible, use different drives for data and log files. I recommend that you keep the default name for distribution database.

On the Enable Publishers screen, select only the required server (SQLServer1). Close the message if one appears. If the Publisher Properties screen appears, leave the default setting (use trusted connection) and clear the option that requests a password to connect the Publisher with the Distributor. Click OK to leave the screen and return to the Enable Publishers screen. If a message appears, click Yes to accept, then click Next.

On the Enable Publication Databases screen, select Merge for your database (ClientDB in this case), then click Next. On the Enable Subscribers screen, select your database (SQLServer2 in this case), click Next, then click Finish. Close the message window that describes Replication Monitor functionality.

Prepare for the Initial Data Synchronization (Snapshot)
After configuring foreign-key and trigger properties for my client's solution, I prepared to create the snapshot. Merge replication requires that you take an initial snapshot for all replicated tables. During the initial snapshot SQL Server adds merge triggers to each published table and adds a uniqueidentifier column to each table that doesn't already have one. SQL Server also creates a new index on the uniqueidentifier column and creates conflict and other system tables.

Merge replication adds a uniqueidentifier column with the ROWGUIDCOL property and the default value newid() to each published table and creates an index on the column. For large tables, this process can be time-consuming. It's faster to add new columns to already populated large tables, then run the initial snapshot for the publication with the affected tables. (For details, see the Microsoft article "HOW TO: Manually Synchronize Replication Subscriptions by Using Backup or Restore" at http://support.microsoft.com/kb/320499/en-us.)

In my case, data in the ClientDB database was disproportionately spread among almost 200 tables that needed replication. The majority (85 percent) of database space was concentrated in the BatchTrans table, which had more than 53 million records. The next largest table had about 5 million rows.

Table 1 shows how the data was spread among the tables in ClientDB. For my client, I wrote a DTS package that prepared structural changes in the seven largest ClientDB tables (each with more than a million rows). I ran this package only once. Even if new initial snapshots were required later, running the DTS package would be unnecessary because the tables would already have the required uniqueidentifier columns.

Configuring Publications and Generating the Initial Snapshot
After configuring the distributor and pulishers, it was time to generate the initial snapshot. To increase the performance of the Snapshot Agent for large publications (or for publications that have large tables), I recommend creating a new profile for the Snapshot Agent and using it instead of the default one. Figure 4 shows the process for creating a new Snapshot Agent profile.

In Enterprise Manger, select Replication on the Publisher (Distributor) server, then right-click and choose Configure Publishing, Subscribers, Distribution. On the Distributor tab, click Agent Profiles, select the Snapshot tab, and click New Profile. In this case, I changed values for a few parameters that affect the performance of Snapshot Agent. On the Replication Agent Profile Details screen, enter the values that Table 2 shows (or tune these values for your own environment as I did here; I tested these changes in the client's development environment and choose the combination that gave me the fastest time). When you've changed the values appropriately, click OK. Then, on the Agent Profiles screen, select the Snapshot_Speed profile and make it a default profile by selecting Change all existing Snapshot Agents to use the selected profile option. Click OK twice to finish the configuration.

At this point I was ready to configure publications. I decided to create 12 publications: 4 different publications that combine groups of related tables and 8 others in which I've combined nonrelated tables based on their business functions, frequency of changes, size, and other factors. By using several publications, you can configure each Merge Agent to run independently, on its own schedule, using different threads, which speeds up the replication process. You do need to balance the number of agents running with available resources. The sidebar "Steps for Configuring Publications," page 31, gives you the process you need to go through to configure each publication you create.

After generating snapshots for all publications, I made a backup of the ClientDB database on the Publisher server, then copied backup files to the Subscriber server and restored the database there.

Creating a New Merge Agent Profile and Configuring Subscriptions
To increase the performance of the Merge Agent and minimize the effect of nonconvergence when SQL Server processes child and parent records in separate generation batches, I decided to create a new profile for the Merge Agent. (To read about the problem of nonconvergence, see the Microsoft article "PRB: Non-Convergence When SQL Server Processes Child and Parent Generations in Separate Generation Batches" at http://support.microsoft.com/default.aspx?scid=kb;en-us;308266.) To create a new profile for the Merge Agent, start in Enterprise Manager and select the Replication node on the Publisher (Distribution) server. Then, right-click and choose Configure Publishing, Subscribers, Distribution. On the Distributor tab, click Agent Profiles, and on the Merge tab, click New Profile.

As Figure 5 shows, on the Replication Agent Profile Details screen, enter the values that Table 3 shows, then click OK and confirm changes if a message appears. On the Agent Profiles screen, select the Merge_Speed profile and make it a default profile by selecting Change all existing Snapshot Agents to use the selected profile. Click OK twice to finish the configuration.

Next, I configured Subscriptions. Starting in Enterprise Manager, select the Replication node of the Subscriber server, select Publications, right-click on a publication, and select Push New Subscription. Click Next, and in the Enabled Subscribers section of the Choose Subscribers screen, select the subscriber server (SQLServer2 in this case), and click Next. On the Choose Destination Database screen, verify the database name (ClientDB in this case), then click Next. On the Set Merge Agent Schedule screen, select Continuously (important to note that this is not a default option), and click Next.

When you see the Initialize Subscription screen, select No, the Subscriber already has the schema and data (again, this is not a default option) and click Next. You don't need to select this option because you've already restored a backup of the database to the target server. On the Set Subscription Priority screen, select Use the Publisher as a proxy, then click Next. Then, on the Start Required Services screen, accept all settings and click Next. Click Finish to complete the wizard, then click Close. In the right pane of Enterprise Manager, you'll see a new line with the newly configured subscription information. You'll need to repeat these steps for all configured publications

Test Merge Replication Before Going Live
Before implementing merge replication in your production environment, I recommend that you test its performance on similar hardware with the expected data load. It's possible that in your particular case, merge replication might not perform well enough to satisfy your specific business requirements. This is true especially in situations that have persistent high data loads. If this is your situation, consider transactional replication or other data transfer scenario. I know of one case when merge replication slowed down the data load on one of the servers by a factor of 2.

However, for small and midsized databases that have evenly spread and a not-very-aggressive transaction volume (as I described earlier), merge replication will usually work fine. Make sure that your user data, log, and TempDB and Distribution files are located on different drives. For additional explanation and advice, read the Microsoft article "SQL Server 2000 Merge Replication Performance Tuning and Optimization" at http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/mergperf.mspx.

This implementation of merge replication satisfied my client's requirements. This solution supports simultaneous data entry on different servers (including modification of text columns) and doesn't require special data partitioning. It avoids many conflicts at the row level by merging changes in different columns into one record. And this solution makes it easy to increase the number of subscribers. This solution doesn't require deep scripting skills, and it can easily be supported by a relatively inexperienced DBA who wants to learn new technology. If your situation is similar to the one I've described here, merge replication is a good choice.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here