Executive Summary:
Taking advantage of the fact that SQL Server "provides additional ordering guarantees beyond ANSI," the Strim function strips out non-alphanumeric characters (e.g., hyphen, exclamation point) from strings. This T-SQL user-defined function runs on SQL Server 2005, SQL Server 2000, and SQL Server 7.0.
|
SQL Server offers a variety of string functions,
but it doesn’t have a function that strips
out non-alphanumeric characters, such as a hyphen (-),
an exclamation point (!), a percent sign (%), or a semicolon
(;). I created a function named Strim, which is
short for strip trim, to fill this gap.
The Strim function takes advantage of the fact that
SQL Server “provides additional ordering guarantees
beyond ANSI.” Specifically, a variable assignment in a
query such as
SELECT @a = @a + col
FROM Table ORDER BY col2
will be performed in the expected order. For more information
about these guarantees, see the “Ordering Guarantees
in SQL Server 2005” entry in the Tips, Tricks, and Advice from the SQL Server Query Optimization Team
blog (blogs.msdn.com/queryoptteam/archive/2006/05/02/588731.aspx). Note that the blog has since morphed into
the Tips, Tricks, and Advice from the SQL Server Query Processing
Team blog (blogs.msdn.com/sqlqueryprocessing/default.aspx).
Listing 2 shows the Strim function. This function takes
one input parameter (@textin), which is limited to 8,000
characters. Strim first converts the string into a derived
table labeled A. The derived table is cross-joined with a
table of numbers named Tally to parse the string. Listing 3
shows the code that creates the Tally table. The WHERE
clause at callout A in Listing 2 filters the intermediate
output from the cross join. The clause uses ASCII codes to
accept only numbers (ASCII codes 48 through 57), uppercase
letters (ASCII codes 65 through 90), and lowercase
letters (ASCII codes 97 through 122). The function then
converts the intermediate output to a derived table labeled
D. For example, if
@textin=”abcdefg”
table D’s rows would be:
| i |
letter |
| 1 |
a |
| 2 |
b |
| 3 |
c |
| 4 |
d |
| 5 |
e |
| 6 |
f |
| 7 |
g |
Finally, the string is reconstituted as abcdefg, assigned to
@textout, and returned.
For comparison, let’s look at what happens if a string
contains non-alphanumeric characters. If
@textin=”pubs-dev”
table D’s rows would be:
| i |
letter |
| 1 |
p |
| 2 |
u |
| 3 |
b |
| 4 |
s |
| 5 |
d |
| 6 |
e |
| 7 |
v |
and the string would be reconstituted as pubsdev.
I’ve used the Strim function to remove nonvisible
characters in imported data (e.g., IBM EBCDIC data).
Another use involves creating filenames for our database
backup files. Our company uses Idera’s SQL Compliance
Manager, which creates files with filenames such
as SQLcompliance_MyDatabase.SQL.DOMAIN.COM.
Our custom backup script uses the Strim function to strip
out the unwanted characters in the database names when
it creates the filenames for the backup files so that the
filenames follow our standard format of myservername_
myinstancename_SQLcomplianceMyDatabaseSQLDOMAINCOM_
backuptype_yyyymmddhhmmss.bak. The
Strim function works on SQL Server 2005, SQL Server
2000, and SQL Server 7.0.
—Louis Nguyen, DBA, Centex Homes
End of Article