Replicating in Sphinx
Microsoft released beta 1 of SQL Server 7.0 (code-named Sphinx) almost a year ago; beta 2 finally came out in December 1997. Sphinx will be a major product upgrade with tons of new functionality. We'd need a few months of the whole magazine to fully describe all the new features, but we'll focus on specific features until Sphinx ships. This month we'll give you a quick overview of Microsoft's plans for enhancing the much-maligned replication capabilities of SQL Server 6.x.
SQL Server 6.x provides two types of support for distributed data. The distributed transaction coordinator (DTC) supports standard two-phase commit (2PC) that lets you keep data completely synchronized at all times across all your sites. Unfortunately, to take advantage of 2PC, you have to do API-level programming, and this synchronous approach doesn't offer acceptable performance in many production environments. The other model, using SQL Server 6.5's built-in replication, is asynchronous and is based on a publish and subscribe metaphor. Publish and subscribe supports loose data consistency rather than 2PC's tight data consistency (loose consistency is a nice way of saying data can be out of sync).
Customers told Microsoft they wanted more options, and, as you can see in Figure 1, the Sphinx replication model of-fers several options. Sphinx retains the loose consistency model, the publish-and-subscribe (master/slave) model found in SQL Server 6.x. Tight consistency in Sphinx is the 2PC model that the DTC currently provides in SQL Server 6.x. However, improved programming interfaces and new replication wizards make setting up this model much easier.
A new feature is No consistency in the form of merge replication and queued transactions. Merge replication is the familiar replication model used in Microsoft Access. This kind of replication is useful if you have mobile workers (e.g., salespeople) making changes offline from the field. Although merge replication obviously can't guarantee full transactional consistency, it fits many real-world business scenarios. Sphinx and Access 98 are expected to support bidirectional merge replications.
Updating subscribers offers a slight twist on a full 2PC approach. SQL Server uses 2PC to ensure that changes simultaneously commit at the master location and at the subscriber that initiated the update, but normal loose consistency rules apply when you're replicating the change to all other subscribers. This model lets you update data at a remote subscriber but doesn't force all the LAN/WAN links to be up at the same time--just the one between the master site and the subscriber that initiated the change.
The Queued transactions model is similar to Updating subscribers, but instead of using the Microsoft Transaction Server to handle the 2PC protocol, it relies on the new Microsoft Message Queue Server (MSMQ), code-named Falcon, which runs as an NT service and offers connectionless store-and-forward messaging. The Queued transactions model supports the ability to apply transactions at the subscriber and then queue the transactions to be applied at the publisher later. Queued transactions complements the Update subscribers approach because it provides an asynchronous backup if the LAN/WAN link between a subscriber and a publisher is temporarily unavailable.
Using DBCC PSS
We recently stumbled across Database Consistency Checker (DBCC) Process Status Structure (PSS), an undocumented function, that can provide a wealth of information about executing SQL Server processes. DBCC PSS displays information from a SQL Server's PSS, an internal data structure. The PSS stores each connection's state of activity so you can work on the connection for a while, schedule it out, work on it again, and so on. Although Microsoft doesn't document much of the information that DBCC PSS spits out, you can find a few pieces of gold if you know how to sift through the output. The usage is
/*Trace flag 3604 redirects subsequent DBCC output to the client rather than to the error log*/
DBCC TRACEON (3604)
/* This command prints PSS information for a given SPID*/
DBCC PSS (0, SPID, 0)
Listing 2 shows some DBCC PSS output we captured during a recent troubleshooting session.
What do the pstat bits (shown in callout A in Listing 2) tell us? Pstat values aren't fully documented, but Microsoft Knowledge Base article "Q171224, INF: Understanding How the Transact-SQL KILL Command Works" (http://premium.microsoft.com/support/kb/articles/q171/12/24.asp), lists the bit settings Table 1 shows. (This Knowledge Base article also explains why the KILL command won't always work.) The pstat value in my sample output was 0x8, so the connection was executing a trigger.
At B in Listing 2, plastprocid_value gives us the object ID of any stored procedure the current connection is executing. In this case, the pstat bit tells us the connection is executing a trigger, so plastprocid_value is telling us the object ID of the trigger the connection is executing. (This explanation makes sense, because triggers are a special type of stored procedure.) We can determine which procedure or trigger the connection is executing by running the following command:
Select object_name(plastprocid_value)
The third gem we can mine from the PSS output is in pxcb->xcb_xactcnt=3, at C in Listing 2. The value 3 tells us that the @@trancount setting for the connection is 3. Transaction management and @@trancount deserve a separate discussion, but for now, all you need to know is that any value other than 0 reveals that the connection is in the middle of a transaction. In this case, we're three levels deep in a nested transaction.
Here's a real-life example of how you can use this information. We were trying to figure out why a process in a development environment had suddenly become very slow. The process was running a DELETE command that previously had run fast, but suddenly started taking more than an hour to complete. It was still running much faster in a separate database with similar data sets and supposedly duplicate schema definition.
We ran DBCC PSS while the command was executing and noticed that the slow version of the DELETE command was executing a trigger. (This fact was obvious because the pstat bit was listed as 0X8.) However, this trigger wasn't supposed to be there at all--we thought we had dropped it during a recent design change. Getting rid of the trigger solved the problem; finding the trigger and troubleshooting the problem was a breeze because we had access to the internal PSS info.
End of Article
Prev. page
1
[2]
next page -->