DOWNLOAD THE CODE:
Download the Code 44074.zip

In rare cases, you might have more data on one or more Subscribers than you do on the Publisher. For example, if a user runs a malicious INSERT command on a Subscriber or if a DELETE statement on the Publisher wasn't propagated to Subscribers, you'd need to remove the unnecessary data from the Subscribers. To handle such situations, I added an extra datetime column called repl_time to each article on every Subscriber. This method might sound like it requires a lot of additions, but you'd do it only in rare situations. Just once, before configuring replication, you run the script that callout A in Listing 4 shows, which adds the extra datetime column to each image table (the table that's part of replication) on the Subscriber. You don't need to add the column to the Subscriber's own tables. Every time a record is inserted or updated on the Subscriber, the proper custom stored procedure (either for insert or update) will change the value in the repl_time column to the default value, getdate(). The code at callout B in Listing 4 shows the stored procedures sp_MSins_ Table1 and sp_MSupd_Table1, which run on the Subscriber and change the value of repl_time. (Note that the insert and update stored procedures are different in Listing 4 than they are in Listing 3, but the sp_MSdel_Table1 stored procedure remains the same as it is in Listing 3).

When you run on the Publisher UPDATE statements that replace a value with the same value—such as the preceding UPDATE statement that sets the value for column f1—all Subscriber records that have corresponding records on the Publisher will get new values in their repl_time columns. Subscriber records that don't match any records on the Publisher will keep the old values in this field. To remove any extra records from the Subscriber, you can run a DELETE command that includes the repl_time column in a WHERE clause. For example, if you run forced replication at 11:00 a.m. on August 27, 2004, the following command

DELETE Table1 WHERE repl_time < '2004-08-27 11:00:00'

will remove from the Subscriber the records that have repl_time values before 11:00 am on August 27, 2004.

Note that you can also create the repl_time column on the Publisher and insert default NULL values into that column during the data load. During replication, data in the repl_time column would change only on Subcribers. This technique will keep table structure identical across all servers. For information about how to maintain data integrity when tables are related through foreign key constraints, see the sidebar "Handling Referential Integrity."

Using Forced Replication to Synchronize Large Tables
To implement forced replication, you need only two new stored procedures: one for small (or lookup) tables and another for large tables. Depending on your table's size, you might decide to run the UPDATE command either against selected rows (by using a WHERE clause) or against the entire table (by leaving out the WHERE clause). If your table is large (many thousands of rows), you should run the UPDATE command against a limited number of records to avoid long-running transactions and to decrease the chance of blocking other users.

If your table has columns that allow data partitioning, you can use those columns to replicate (and synchronize) only part of the table. In most cases, you can use a datetime column to filter the data. You can create a stored procedure, such as the sp_ForceRepl stored procedure that Listing 5 shows, to filter records for replication. The sp_ForceRepl stored procedure takes four parameters: TableName, BatchSize, StartDate, and EndDate. Let's take a look how this stored procedure works. First, you tell the code to select into the temporary table #Table1 all primary keys for records that satisfy the criteria you specify in the WHERE clause on the f3 column. In Listing 5, the code is selecting records with values in column f3 between @sStartDate and @sEndDate. Then, you select into the #Table1_Temp temporary table a certain number of primary keys from #Table1, which you specify in the @iBatchSize parameter, and update the records in Table1 that have the keys you specify. At the end of Listing 5's loop, the code removes the processed primary keys from #Table1 and truncates the #Table1_Temp table in preparation for the next cycle. You can create a similar filtering process for any other large table.

Using Forced Replication to Synchronize Small or Lookup Tables
Sometimes, the tables you're working with are relatively small or can't be partitioned, as with lookup tables, which are usually much smaller than operational tables that you can partition by such criteria as ID or time. If you need to replicate a table that can't be partitioned, you can run against all the table's records an UPDATE statement that doesn't change the value in the selected column. Because forced replication implementation for lookups doesn't require the preliminary step of selecting records that satisfy given criteria (as with the sp_ForceRepl stored procedure in Listing 5), you can write a generic stored procedure that works against all lookup tables. Listing 6 shows the code that creates such a stored procedure, sp_ForceRepl_Lookup. Note that Listing 6's code runs one UPDATE command for an entire lookup table instead of breaking the process into several batches, as you would with a large table.

The code in Listing 7 creates in the msdb database a table called ForcedReplTblExclude that stores the names of the large tables from all the publications that you need to process separately by using batches. Then, sp_ForceRepl_Lookup loops through all tables except those that are listed in the ForcedReplTblExclude table, finds the first column in the selected tables' primary keys, and constructs and executes the UPDATE statements that synchronize the data.

Putting It All Together
Now you're ready to implement forced replication. You need to follow several steps.

  1. Verify that all foreign keys on the Publisher are configured with the NOT FOR REPLICATION option.
  2. Add the repl_time column to each table on the Subscriber, and modify the replication stored procedures for INSERT, UPDATE, and DELETE commands for all articles on the Subscriber. (To learn more about the options you can choose when configuring replication, see the Web sidebar "Additional Facts About Configuring Transactional Replication" at InstantDoc ID 44150.)
  3. Create the ForcedReplTblExclude table to hold a list of all non-lookup tables (e.g., those that you can partition by datetime or ID).
  4. Create the sp_ForceRepl stored procedure, which executes for each partitioned table an UPDATE command that replaces a value with the same value.
  5. Create the sp_ForceRepl_Lookup stored procedure to handle forced replication for all other tables.
  6. Create a DTS package, a data link file, and configuration INI file, as the Web sidebar "Creating a DTS Package to Run Forced Replication" (InstantDoc ID 44149) describes.
  7. Create a job that executes the DTS package, but don't schedule this job to run. The DBA should start it manually, whenever you need data synchronization.

Having in your arsenal a tool that implements forced replication will make your life easier. Such a tool lets you quickly synchronize the data at different sites after a disaster has happened—but don't wait for disaster to strike. Use the scripts I provide in this article to implement a forced-replication solution in your development environment. There, you'll see the benefits of the tool because in a development environment, you're likely to have more replication interruptions than in your well-monitored, stable production environment. Having the tool ready for production will give your company a valuable, cost-saving technique for switching database operations to a secondary data center.

End of Article

Prev. page     1 2 [3]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

very though provoking

blueoyester2000

Article Rating 4 out of 5

It looks just like transactional replication under SQL Server 6.5.

Anonymous User

Highly useful. Great Job!

mspalding

Article Rating 5 out of 5

Great article i think.

sanjeev1810

Article Rating 4 out of 5

 
 

ADS BY GOOGLE