After you select the provider or adapter, the dialog box controls you see will be based on your selection. Notice also that the flat-file format I've specified is Ragged Right, which tells the flat-file adapter how it should parse the source file. Ragged-right support is new in DTS 2005 and is the same as fixed-width format, with one exception: Each row is delimited with a CR/LF combination. The other formats available are fixed-width and delimited. Fixed-width format means that the data is aligned into columns whose width is the same for all rows. Delimited data means that the columns have some delimiter such as a semicolon, tab, or comma; comma-separated value (CSV) files are an example of delimited data.
Because the example file includes two header rows, one containing the column names and the other containing dashes, I could set the Header rows to skip option to 2 and ignore the first two column-header rows. But I selected the Column names in the first data row check box instead. This selection isn't necessary, but it lets the adapter pick up the column names, which might be useful later if I want to edit the package in the DTS Designer. In addition, working with real column names rather than default names such as Column1 and Column2 is always nice. In a moment, I explain how to handle the second row of dashes in the sample data.
The Locale drop-down box lets you select the locale the source connection and adapter will use. The locale setting affects the way DTS converts the data from strings. For example, DTS converts date strings to date representations that are locale-specific. So DTS would convert the date Monday, May 17, 2004 to the short-date format 17/5/04 in the English/New Zealand locale but to 04/5/17 in the English/South Africa locale. By default, the wizard selects the locale of the machine on which the wizard is running. If you change the locale, the wizard tries to make life a little easier and selects the default code page for that locale. The code page isn't bound by the locale setting, however, and you can select any code page. As Figure 1 shows, I selected the Unicode setting, so the Code page drop-down box is disabled.
You can use the Header row delimiter setting to work with flat-file formats that differentiate between header rows and data rows. These file types use one character to delimit the header row and another to delimit data rows. The reasons to differentiate between header and data rows are varied; for example, some legacy bulk-insert systems skip a row if the header isn't delimited differently from the data rows. If your flat file is formatted this way, the new wizard supports it.
The Text qualifier box is useful when the source file is delimited. For example, if your data source is a comma-separated file and you specify a double quote (") as a text qualifier, the wizard will ignore commas in text inside double quotes. So in the string "15, rue Descartes", the wizard won't interpret the comma as a delimiter. Text qualifiers are typically double or single quotes but can be user-defined as well.
Once you set the source file and format, you can set up the column widths. Click the Columns node in the tree in the left pane of Figure 1's window to open the Columns dialog box that Figure 2 shows. Because I selected ragged-right format, I need to define the width of each column. If the source file were delimited, the wizard would automatically detect the column widths for you. Clicking the ruler control creates a marker that lets you choose where the columns start. You can remove the column marker by double-clicking the marker or right-clicking it and selecting Remove from the resulting context menu.
The next step is to specify the column properties. After you set column widths, click the Column Properties node in the tree in the left pane to open the dialog box that Figure 3 shows. Here, you can set type-specific properties to ensure each column is converted correctly. For example, if a column contains dates, you'd specify the data type DT_DATE for that column. The wizard automatically adds column definitions based on the width settings that Figure 2 shows. However, if for some reason you need to modify any column property manually, you can use the New and Delete buttons to do so.
At the bottom of the Column Properties dialog box is the Suggest Types button. Clicking it brings up the Suggest Column Type dialog box. This feature does some smart processing to determine the type of each column in the source flat file. You should use this feature to automatically assign column types because it's simple, fast, and usually correct. Assigning the appropriate column types makes it easier to modify the resulting package and optimizes the resulting data flow by assigning the smallest appropriate types to the rows.
After setting the column properties, you're ready to look at the data preview that Figure 4, page 19, shows. For our example data, everything looks accurateexcept that the one row with dashes in it below the column names is gone. I told the wizard to skip that row by setting Data rows to skip to 1. Other than the rows-to-skip setting, this final Choose a Data Source window is mostly designed to let you verify that all the settings are correct. This window, new in DTS 2005, lets you scroll through all the columns and look for problems such as columns that are too narrow or invalid rows.
Prev. page
1
[2]
3
next page