How to handle character data in queries
In last month's article, I looked at simple Select statements. This month, I use SQL Server 6.5 examples to show how to handle character data in queries. These examples also work in SQL Server 7.0, with a few differences, which I'll point out as they come up. To run the queries in this article, use the ISQ/W window or the Query Window in the Enterprise Manager in SQL Server 6.5. In SQL Server 7.0, use the Query Analyzer from the program group or call it from within Enterprise Manager.
What Is Character Data?
Loosely defined, character data is any data that's stored as a string of ASCII characters rather than as a numeric value. Many database fields (columns) are character datanames, addresses, job titles, etc. Some fields that appear to be numeric are character data, such as telephone numbers, zip codes, and equipment serial numbers. This datatype makes sense because you don't perform mathematical calculations on such fields. In addition, you might want to mix letters and numbers, for example, to accommodate Canadian postal codes. Also, a zip code such as 00217 is easier to handle as a character field. The integer type drops leading zeros and therefore would truncate a zip code of 00217 to 217.
You specify character data in SQL Server by enclosing the data in single quotes. (SQL Server 7.0 is more forgiving if you make formatting errors than SQL Server 6.5.) For example, the following query for all authors from California in the Pubs database is correct because of the quotes around 'ca':
USE PUBS
SELECT au_fname, au_lname, phone, au_id
FROM authors
WHERE state = 'ca'
If you omit the quotes, you'll see an error message, "Invalid column name ca." Both SQL Server 6.5 and 7.0 will interpret the missing quotes as a request to compare two columns rather than comparing a column to a text string.
A similar error occurs if you try to select data that appears to be numeric, such as zip codes:
USE PUBS
SELECT au_fname, au_lname, phone, au_id
FROM authors
WHERE zip > 90000
In SQL Server 6.5, this comparison fails because, as the error message will tell you, you asked for an implicit conversion. In other words, you asked SQL Server to compare two dissimilar values (a character field, zip code and numeric value, 90000). Although SQL Server 7.0 will run the query if you omit the quotes, I recommend following the ANSI SQL standard and putting 90000 in quotes.
Variable-Length Character Fields
In SQL Server 6.5, you can choose fixed-length columns or variable-length columns for character data. You designate such columns as char(n) or varchar(n), respectively. SQL Server 7.0 offers these choices plus support for Unicode characters, which you designate as Nchar(n) or Nvarchar(n). SQL Server stores character data as a column in a table, and it stores text data in the database, but not as part of the row. The row includes only a pointer to the text data's location. SQL Server 6.5 limits character data to 255 characters and stores additional characters as text data. In SQL Server 7.0, you can use character data for columns up to 8000 bytes long, which is an entire page.
Some character data is fixed-length, such as zip codes or state codes. But for character data such as names, addresses, and product descriptions, you might consider variable-length columns. This approach can save storage space by storing only the bytes required to hold the data, plus one extra byte that tells how many bytes of data are in the column. The down side of this method is that SQL Server first reads the byte that describes how long the data is in the row, then reads that many bytes of data. Thus, reading variable-length data is a two-step process. Data storage is cheap, but users expect better performance than variable-length columns allow. Therefore, many developers use fixed-length columns for speed rather than choosing the more efficient storage option. However, SQL Server 7.0 might change such thinking.
The sort order and, to some extent, the character set you choose when you install SQL Server will affect how data is returned when you query the database. For example, if you choose a binary sort order and list customers by last name, SQL Server returns them in the order Adams, Baker, Cole, Edwards, . . . Zahn, de Lucia, van Zandt. The sort uses the ASCII order of A-Z, then a-z. So customers de Lucia or van Zandt show up after Zahn. The default sort order in SQL Server 7.0 is dictionary sort order, case-insensitive, so SQL Server 7.0 lists the names as you expect to see them: Adams, Baker, Cole, de Lucia, Edwards, . . . van Zandt, . . . Zahn. If your queries show data in a different
order from what you expect, run the system stored procedure SP_HELPSORT to show the settings for sort order and character set.
SQL Server provides many functions for manipulating character data, including the ability to concatenate character data to combine multiple data columns into one output column. For example, you can combine the first and last name fields in a variety of ways. In these examples, I added formatting such as spaces and commas. This example shows first and last names as two separate columns:
SELECT au_fname, au_lname
FROM authors
Prev. page  
[1]
2
3
next page