Showing posts with label ASCII. Show all posts
Showing posts with label ASCII. Show all posts

22 June 2010

Querying Extended ASCII Characters in SQL Server

Part of a project requires conversion of ADABAS to SQL Server.  ADABAS hearkens from the day when storage space was a precious commodity, so the "Packed" data types were invented.  These compress the values stored in the column, to maximize storage utilization.

When converting packed ADABAS File fields to SQL Server relational table columns, some of the packed data was not unpacked(?) correctly, resulting in some interesting characters appearing in SQL Server.  The entire data content of a field needs not be packed; ADABAS allows you to leave the first N characters unpacked, and then pack the remainder of the field, and other such options.

It was my job to find all records across the entire database (we're talking millions of records per table) that contain ASCII characters that do not appear on a standard, 108 key, US English, QWERTY keyboard.  Constructing a query that iterates through all tables and columns that are varchar data type is easy.  However, the SQL Management Studio query editors don't display extended ASCII characters.

The solution was pretty simple.  Cast a byte value to a character type, to specify the extended character ranges.

SELECT RecordID
FROM MyTable
WHERE ((patindex('%[' + char(0) + '-' + char(31) + ']%', ColumnName COLLATE Latin1_General_BIN2) <> 0)
      OR (patindex('%[' + char(127) + '-' + char(255) + ']%', ColumnName COLLATE Latin1_General_BIN2) <> 0))

This selects records from the table where the number of extended ASCII (key codes 0-31, and 127-255) characters in a specific column is not 0.