Populate the Repository
Now that you have the list of SQL Server machines, and the repository database is
defined to store information about these
servers for databases, jobs, and backup history, it's time to use SSIS to populate this
simple repository. SSIS might seem like a
complex design environment if you've never
used it. Many DBAs use DTS for SQL
Server 2000 to cut their teeth for extraction,
transformation, and loading (ETL).
Figure 1 shows the full package
that I used to populate the repository database. This package consists of three areas:
(1) migrating and/or truncating repository
tables to maintain the repository, (2) populating a variable with an ADO record set of
server names derived from the commandline utility Sqlcmd, and (3) using this variable
to programmatically connect to each server,
one by one, and pull information into the
repository. You can download this solution
from SQL Server Magazine's Web site. (Go
to http://www.sqlmag.com, enter 95385 in
the InstantDoc ID text box, then click the
95385.zip hotlink.) The solution is called
DBA_Server_Load and the SSIS package it contains is called Populate_DBA_Rep.dtsx.
You can use Business Intelligence Development Studio (BIDS) to open the solution
and package.
Truncating tables and migrating data to
archive tables occur first in the SSIS package.
The Execute SQL Task objects that run
the Truncate Table statements are grouped
together in a sequence container at the top of the package. When the package runs, all the tables are initially truncated; the only exception is the Job stable. Before truncating the Jobs table, a Data
Flow task moves the data
from the Jobs table to the Jobs_Archive table. I wanted to maintain a history of job successes and failures to analyze over time. The other tables
need the most current data—and in my opinion, starting fresh each time
for this semi-static information is cleaner. As Mentioned previously, I'm pulling 5 days' worth
of backup history that will repopulate with each
run. Figure 2 shows the
dialog box to configure the Execute SQL Task object to truncate the SQL_Servers table. When all the objects in the Truncate Tables and Populate
Archives sequence container complete successfully, the package moves to the second phase, which is to populate a
variable with an ADO record set.
Before I explain how
to populate a variable
from a SQL Server query
to use within an SSIS
package, let me explain
why you might want to
do so. If you have fewer than 10 servers, creating a separate connection to each server might be feasible. In
SSIS you'd have 10 Connection Managers,
each pointing to one SQL Server machine.
More than 10 servers is problematic, but a
tenacious DBA might be willing to create
separate connections for, say, 20 servers—as
long as no additional servers are expected, or the DBA is willing to manually add servers
and maintain the package indefinitely. In
my case, I had more than 100 servers, so I
needed a better solution.
The task to populate the variable uses
a simple SELECT statement to query the
ServerList_SSIS table containing the server
names previously derived from Sqlcmd. The
query is:
SELECT RTRIM(Server) AS
<servername>
FROM
ServerList_SSIS
Figure 3 shows the
dialog box to configure the Execute
SQL Task object
for the Populate
ADO Variable
task, including the
Connection, SQLSourceType, and
SQLStatement property values. A direct
input query to the
DBA_Rep database
resides on the local
SQL Server machine
named QASRV,
which is also where
the package will run.
To use the query
results to populate
a variable, you need
to have a variable
already set up.
For my solution,
I needed to set up
two variables: one
for the Execute SQL
Task object, and one
for the final third of
the package, which
uses Foreach Loop
container objects.The five Foreach
Loop containers (i.e., one for each
database table in the
DBA_Rep database)
employ a combination of techniques to meet the defined goal
of programmatically changing a connection
string iteratively for each server and executing
a Data Flow object to retrieve server-specific
information.
To open the Variables toolbar, right-click
anywhere in a blank area of the package
and select Variables from the list. Figure 4 shows the Variables toolbar with two defined
variables of two different data types. The
first, SRV_Conn, is a simple string variable.
The second, SQL_RS, is an object data type. These distinctions are important. Because
the result set from the SELECT statement
contains multiple records, a string variable
doesn't work. I needed to use the SQL_RS
object with a value of System. Object to
hold the results, then map the two variables,
object to string, in the Foreach Loop container. I used the following four simple steps
to accomplish this task.
- Under Result Set in the Populate ADO
Variable task, set the Result Name to 0 and
the Variable Name to User::SQL_RS, as Figure 5 shows.
- In each Foreach Loop container, set
the enumerator in the Collection area to
Foreach ADO Enumerator, and set the
ADO object source variable to User::
SQL_RS, as Figure 6 shows.
- Under Variable Mappings in the
Foreach Loop containers, set the Variable to
User::SRV_Conn and the Index value to 0,
as Figure 7 shows.
- Use a property expression in the Connection Manager object to assign the string
variable to one Connection Manager. When
the Foreach Loop container executes the Data Flow tasks it contains, the connection
string is dynamically built with each enumeration of servername. In this package, the
Connection Manager called MultiServer
serves this purpose. Setting the variable to
the ServerName property, as Figure 8 shows, causes the connections to set
themselves correctly for each server.
After the variable mappings are in place,
you can use Data Flow objects within each
Foreach Loop container to load the tables.
You can place Data Flow objects on the SSIS
package's Control Flow tab, but these special
objects have their own tab on which you can
define their properties and sequencing. In
general, a Data Flow task consists of a source
and a destination object. In my solution,
both the source and destination are OLE
DB connections to a SQL Server machine. I
configured the source to use the MultiServer
connection that would enumerate through
the list of servers, and I configured the local
DBA_Rep connection as the destination
to hold the data. Source and destination
columns are mapped together. The source
can be an object, such as a table or view, or
as in my package, it can be a SQL query to
be used as a derived table.
For four of the five tables, I sent one
query to select values and loaded the results
from the remote servers into the local
DBA_Rep database. To examine the tables'
source queries, right-click the Data Flow
object within the Foreach Loop container
and select Edit. Then, on the Data Flow
table, right-click the OLE Data Source
object and select Edit again to display the
source query. Web Listing 2 contains the
query to load the Databases table.
I needed to handle the Database_Info
table, which loads detailed information
about each database, a bit differently. Because
the SQL Server 2000 Master database stores
basic information about each database in
the sysdatabases tables, and each database stores the remaining important information,
I needed to query each database individually. I could have used the stored procedure
sp_MSForEachDB, which has been available
since SQL Server 7.0, to easily accomplish
this task. However, using this stored procedure doesn't return a solitary result set.
I needed a full result set, so I considered
other alternatives. Using temp tables or table
variables would have given me the full result
set I needed, but setting up and maintaining
temp tables is difficult and requires special
considerations. For example, you need to
create the temp table beforehand, and you
must set a value to retain the connection.
Prev. page
1
[2]
3
next page