| Executive Summary:If you work with SQL Server Integration Services (SSIS), at some point you’ll probably need to access data from a MySQL database and integrate it into Microsoft SQL Server. Here’s an easy-to-follow solution that uses SSIS, Business Intelligence Development Studio, MySQL connectors, and SQL and SQL Server (T-SQL) scripts to retrieve data from two MySQL tables and insert that data into corresponding SQL Server tables. By performing the steps in the solution, you’ll become familiar with retrieving data from a MySQL database and some of the issues you might have to address when you retrieve that data.. |
As you work with SQL Server Integration
Services (SSIS), sooner or later
you’ll have to retrieve data from
a MySQL database. Fortunately, MySQL provides
drivers that make the process relatively easy. And once
you learn the basics of how to use those drivers, you’ll
be able to integrate MySQL data almost as easily as
SQL Server data. To help you get familiar with using
SSIS with MySQL, I’ll show you an SSIS solution—
MySQL_SQLServer—that retrieves data from two
tables in a MySQL database and inserts that data
into comparable tables in a SQL Server database. In
our discussion here, I assume that you already know
how to develop an SSIS package in Business Intelligence
Development Studio, the pared-down version
of Microsoft Visual Studio 2008 that ships with SQL
Server 2005. I also assume that you’re familiar with
how to define connection managers, control flow,
data-flow elements, and Windows ODBC data sources.
You can find information about MySQL at www.mysql.com. To download the generally available edition of
MySQL, go to the Developer Zone.
Step 1: Connecting to MySQL
The first step in creating an SSIS solution that connects
to a MySQL database is to install the driver needed to
connect to that database. MySQL provides a number
of drivers (called connectors) that let you connect to a
MySQL database. To connect from SSIS, you can use
the Connector/Net driver or the Connector/ODBC
driver. The Connector/Net driver provides an ADO
.NET interface to the MySQL database, and, as the
name suggests, the Connector/ODBC driver provides
an ODBC interface. In the MySQL_SQLServer solution,
I demonstrate both connectors.
You can download the two connectors at www.mysql.com/products/connector. The site provides
information about downloading and installing the
connectors and points you to other references that
explain how to use the connectors. I encourage you
to review these materials before installing either connector.
Because the site so thoroughly documents each
driver, I won’t go into the driver details here. However,
to create the MySQL_SQLServer solution, you’ll need
to download and install both connectors. Follow the
instructions on the MySQL website, and for both connectors,
perform a default installation.
To use the Connector/ODBC driver, you must
create a Windows ODBC data source after you’ve
installed the driver. The data source should point to a
MySQL database. For this article, I’m using the Bookstore
database, which contains two tables, Authors and
Books. I wrote an SQL script, MySQL_BookstoreDB,
to create the database and populate the tables with a
few initial rows; you can download this code, as well
as the other files for the MySQL_SQLServer
solution by clicking on the Download the Code link at
the top of the article. You’ll need to run this
script against a MySQL installation before
you can implement the solution. Note that, for this
solution, I installed MySQL 5.0 (a default installation)
and SQL Server 2005 Developer Edition on the same
Windows XP computer.
After you create the Bookstore database in MySQL
and install the Connector/ODBC driver, you can create
the Windows ODBC data source. To support my solution,
I created a system Data Source Name (DSN)
ODBC data source called mysql and selected MySQL
ODBC version Driver (aka the Connector/ODBC
driver; the driver is displayed with this name after it’s
installed) from a list of existing drivers. Note that if
you installed MySQL on a port other than the default,
you’ll need to enter the correct port on the Connect
Options tab in the Connector/ODBC Configure
Data Source Name dialog box. For more information
about creating an ODBC data source, see Windows
Help. (You don’t have to perform any steps to use the Connector/Net driver; it’s ready to go as soon as you
install it.)
Step 2: Defining Connection
Managers and Control Flow
Now that you’ve set up your drivers and ODBC
data source, you can create your SSIS solution. You
can create the solution in either Business Intelligence Development Studio or the full version of Visual Studio
2005 or later. (Use the Integration Services Project
template to create your SSIS package.) When creating the SSIS package, you should start by defining
the connection managers. But before you do
so, run the SQLServer_BookstoreDB script,
which you can also download through the hotlink at the top of this page, to create the target database (Bookstore)
in SQL Server. Then create a connection
manager to the Bookstore database.
To test both MySQL drivers, you need
to create a connection manager for each
one. Let’s start by creating the connection
manager for the Connector/ODBC driver or
more specifically, for the mysql DSN. Intuitively,
you’d think that you simply create an
ODBC connection manager. However, this
approach doesn’t work. Instead, you must use
a Microsoft .NET Framework ODBC data
provider, which is an ADO.NET connection
type, to create the connection manager. To
create an ADO.NET connection, right-click
the Connection Managers pane in Business
Intelligence Development Studio, then click
New ADO.NET connection. In the Configure
ADO.NET Connection Manager dialog box, click
New. In the Connection Manager dialog box, select
.Net Providers\Odbc Data Provider from the Provider
drop-down list. Also select mysql as the system DSN.
Because the mysql DSN includes the username and
password, you don’t need to specify it here.
Now create the connection manager for the
Connector/Net driver. Once again, you create an ADO
.NET connection. However, this time you should use
the provider .Net Providers\MySQL Data Provider. In
the Connection Manager dialog box for this provider,
you must enter the name of the MySQL database,
server, user, and password. In our example, I specified
Bookstore as the database and localhost as the server. In
addition, I used the root user account to access MySQL,
but typically you’d use a more restrictive account. Note
that after you enter the password and close the dialog
box, the password is no longer displayed.
The next step in creating the SSIS solution is to
add two Data Flow tasks to the control flow, one for
each MySQL driver. Each task will retrieve data from
one of the tables in the MySQL database. For this
solution, I used the Connector/Net driver to retrieve
data from the Books table and the Connector/ODBC
driver to retrieve data from the Authors table. After
you add the Data Flow tasks, you can view them
along with the connection managers you configured
(MySQL_Net and MySQL_ODBC, in our example)
on the Control Flow tab in SSIS Designer.
Step 3: Configuring the Books
Data Flow
After you add the Data Flow tasks to the control flow,
you can begin to configure the Books data flow. Start by adding a DataReader source component. To
do so, open the component’s editor. Then, on
the editor’s Connection Managers tab, specify
the connection manager for the Connector/Net
driver. In my solution, I named the connector
MySQL_Net.
Next, on the Component Properties tab, add a
SELECT statement to the SqlCommand property
to retrieve the data from the Books table. I used
the statement
select * from Books;
I left all other properties with their default values,
as I did on the Column Mappings and Input and
Output Properties tabs. However, there are a few
items on the Input and Output Properties tab that
are important to note.
If you expand the Input and Output Properties
tab’s DataReader Output tree, as Figure 1 shows,
you’ll see a list of input (external) and output
columns. Look at the BookID output column. As
you can see, the data type assigned to this column
is a four-byte unsigned integer, which is consistent
with the column created in MySQL. (An unsigned
integer is one that doesn’t support negative numbers.)
However, the BookID column in SQL Server won’t
accept the value in this format, which means that you’ll
need to convert the column to the correct data type.
You’ll run into the same problem with the NumIn-
Stock column, although this is a two-byte unsigned
integer, rather than four-byte.
Now look at the Title output column’s properties.
Notice that the data type is Unicode string. Although
the MySQL column definition doesn’t specify Unicode,
this is how the data is saved. However, this isn’t how
the column was created in SQL Server. Consequently,
you’ll need to convert this column, too. You don’t have
to worry about the DateModified column’s data type.
SSIS retrieves this as a timestamp column, which is
compatible with SQL Server’s DATETIME column.
Because you have columns that are incompatible
between the source and destination, you’ll have to
convert those columns. To do so, your next step is to
add a Data Conversion transformation to the data flow
and connect the source to the transformation.
Continue to page 2