SideBar    Sidebar: An ORDB Hybrid

OpenQuery doesn't force you to specify connection information, because the linked server already contains all the information you need. You can reference the OpenQuery function, as you can with OpenRowset, in the FROM clause of a query as if the open query function were a table name. You can reference the function as the target table of an INSERT, UPDATE, or DELETE statement, depending on what capabilities the OLE DB provider has exposed. (Like ODBC, the OLE DB specification is large, and different providers will furnish different levels of functionality. Microsoft has not published conformance levels for OLE DB as it has done with ODBC.)

EXAMPLE 4
Example 4 uses OpenQuery to re-create the distributed query in Example 1: As you can see, OpenQuery is easier to read than OpenRowset. OpenQuery runs a little faster because you're using the connection information that you already supplied using sp_addlinkedserver. Figure 2, page 28, shows the syntax for an OpenQuery.

SELECT
RemoteOrders.*
FROM 
OpenQuery(Northwinds, "select * from orders")
RemoteOrders left join MasterOrders on
RemoteOrders.OrderId = MasterOrders.OrderId 
WHERE 
MasterOrders.OrderID = null

You have three choices when deciding how to execute a distributed query. You can create a linked server with sp_addlinkedserver, which makes the underlying tables available. For the ultimate in dynamic flexibility, you can use OpenRowset to execute a passthrough query against a remote OLE DB data source, specifying connection information dynamically. Another choice is to use OpenQuery and pass through a query to an existing linked server.

Stored Procedures and UI
These examples use stored procedure interfaces for managing and executing distributed queries. Table 1 lists stored procedures and their functions.

Has Microsoft tired of easy-to-use database products and scrapped a user interface (UI)? Of course not! Details about the UI are beyond the scope of this article, and being familiar with the underlying stored procedures is valuable knowledge. Stored procedures are invaluable when you're writing scripts and automating tasks, but the UI does exist and will be especially helpful to the novice user.

Security
When you connect to almost any data store, whether it's a database, mail engine, or voice mail system, you must log in and provide user and security information. When using distributed queries, you're connected to the SQL Server system you're issuing queries from, but you must also connect to the remote data, which is often protected. SQL Server has three ways to authenticate you as a downstream distributed query source.

First, SQL Server can leverage the concept of integrated security and pass through Windows NT domain account credentials (for details, see Vicky Launders, "SQL Server 7.0 Security," page 54). This solution lets you use your NT credentials to first connect to SQL Server A, then use the same set of NT credentials to issue a distributed query against SQL Server B. This approach is the easiest and most powerful way to implement security for distributed queries, but you must wait for Windows 2000 (Win2K—formerly NT 5.0) for this support.

Second, SQL Server can authenticate you with the standard security credentials you supplied when logging in to SQL Server. If you log in to SQL Server A as JoeUser and run a query against SQL Server B, you're mapped as JoeUser on that server as well. The first problem with this security type is the user accounts and passwords must exist on both machines. Second, this approach won't work under NT 4.0 if JoeUser is logged in to SQL Server using integrated security.

The third security option to authenticate users is mapped logins that you create and manage with sp_addsrylogin. Such logins are the most practical approach for managing distributed query environments. Until NT 5.0 ships, this approach is the only option available if your users are connecting through integrated security. When a user logs in to the local server and executes a distributed query that accesses a table on the linked server, the local server will log in to the linked server on behalf of the user to access that table. You use sp_addlinkedsrvlogin to map the login credentials that the local server uses to log in to the linked server. You can execute sp_addlinkedserver to automatically create a default mapping between all logins on the local server and remote logins on the linked server. This default mapping states that SQL Server uses the local login's user credentials when connecting to the linked server on behalf of the login (equivalent to executing sp_addlinkedsrvlogin with @useself set to TRUE for the linked server).

Most of the sp_addlinkedsrvlogin arguments are intuitive except for @useself. (Figure 3 shows the syntax for @useself.) Setting @useself to TRUE tells SQL Server to use the current user credentials (username and password) when connecting to the remote server. If this value is TRUE, the system ignores the @rmtuser and @rmtpassword values. BOL says that a value of TRUE for @useself is invalid for an NT authenticated logon unless the NT environment supports security account delegation and the provider supports NT authentication. This approach won't work until Win2K is available, because NT 4.0 does not support security account delegation. Distributed queries and data partitioning will be relatively seamless after NT supports this feature.

Limitations and Other Gotchas
BOL has more than 40 pages of relevant distributed query information; this article covers the basics. I don't have space to discuss all the gotchas, but I want to share some key points.

Don't expect blazingly fast speed when performing cross-server joins. Distributed queries are powerful tools for extending the functionality and reach of SQL Server, but flexibility comes at a price. Current support for distributed queries does not include a full-featured, cost-based, distributed query processor. This situation means SQL Server might need to work a lot harder to join tables on remote servers than when joining the same data sets locally. Network overhead is affected, but the bigger problem is that SQL Server won't always have the same optimization options available when joining large data sets. SQL Server will try to execute as much of the query on the remote server as possible, but the system will inevitably suck large data sets across the network and store them in local temp tables so that SQL Server can process the join. I'd love to provide more details about the potential pitfalls, but the technology is so new that I haven't uncovered most of them yet. I'll explore this topic in a future article.

Portions of a query might execute on both the local and remote servers. Because of this situation, BOL says that distributed queries are always executed as if SET ANSI_DEFAULTS were set to TRUE. Read a little further in BOL and you'll see that the OLE DB provider for SQL Server sets CURSOR_CLOSE_ ON_COMMIT and SET IMPLICIT_TRANSACTIONS to FALSE after the connection is made. This information is spread throughout BOL.

OLE DB providers accessed from a distributed query can support different sort orders and character sets from those the local server supports, and SQL Server will always use the local sort order when processing data sets across servers. Consider the following example from BOL:

SELECT s.OrderId, s.OrderDate
FROM SalesSvr.pubs.dbo.Sales s
ORDER BY s.OrderDate

SQL Server processes the order by using the sorting order installed on the local server; so you need to move the entire data set across the network and sort it locally. Using sp_serveroptions to set the collation compatible option to TRUE tells SQL Server that the remote server sort order and character sets are compatible with the local ones. This information will give the SQL Server query processor more freedom to choose what pieces of a query can be executed on the remote server because comparisons of character data can pass directly to the remote server.

You must upgrade the catalog stored procedures on SQL Server 6.x before it can participate in a distributed query with SQL Server 7.0. See Upgrading the Catalog Stored Procedures (OLE DB) in BOL for information about how to upgrade the stored procedures.

A Little Fun
Distributed queries offer cool functionality in the initial release of SQL Server 7.0, but they have limitations. Performance needs to increase with heterogeneous joins, and I'd love to see more OLE DB providers supplied for Microsoft and third-party vendor products. The current inability to fully leverage NT single logon through integrated security will create headaches when people deploy this technology.

I hope this article helps you understand how important distributed queries and OLE DB technology are to the future of SQL Server. More important, I hope you now have enough information to start writing distributed queries—today.

End of Article

Prev. page     1 2 [3]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Good article .... I have one question ... Does the openrowset execute the stored procedures addlinkedserver, addlinkedsrvlogin, etc ... I need to insert into sql 7 tables from access and the access mdb files will change. I have written a stored procedure that reads a list of tables and can change the linked server info thru the sp_xxx only when the mdb file changes. If I use openrowset and the stored procs are executing behind the scenes then they will run every time which isn't necessary. Any thoughts would be appreciated. Thanks Joe

Joe Finnerty

I don't believe that the following line from the article is correct 'Until NT 5.0 ships, this approach is the only option available if your users are connecting through integrated security. ' It should rather read 'Until NT 5.0 ships, this approach is the only option available if your users are connecting through SQLServer authentication'. Otherwise the writer is contradicting themself in the paragraph below. Using sp_addlinkedsrvlogin with @useself set to TRUE is only valid for SQLServer authentication unless the O/S is Win2000. Further sp_addsrylogin should read sp_addlinkedsrvlogin. The writer should not be addressing 7.0 security as Integrated security. This is a 6.5 feature and they should be discussing in terms of the authentication model. Also, the writer discusses three security options but essentially the first and the third paragraphs are discussing the same ie impersonation.

Paul Williams

Just one question: how can I access e.g. Excel or txt files on remote machines?

Anonymous User

No cacho ni una guea esta caga de arti culo

Anonymous User

Article Rating 1 out of 5

 
 

ADS BY GOOGLE