Trimming that hard to reach whitespace

Not all whitespace is created equal, and when LTRIM and RTRIM in T-SQL fails to remove leading and trailing space its easy to stare and panic.  Having worked on a lot of ETL pipelines and legacy databases lately I’ve encountered some pretty nasty data sources and eyebrow-raising moments.

In one particular case, I encountered fields with some 40 characters of whitespace padding which was throwing off some matching on natural key look-ups in my ETL pipeline. Ok, no problem, nothing LTRIM(RTRIM(….)) can’t solve, right?  Well, much to my surprise no trimming took place at all. After a moment reorganizing my eyebrows on my face I went at investigating the problem.

Some very handy T-SQL tools in this case is ACII( ) and UNICODE( ).  Taking a quick look at one of these whitespace characters quickly reveals the problem.  ASCII(RIGHT(wordExpression, 1)) shows the ASCII value of 160, a non-breaking space.  As it turns out, LTRIM and RTRIM only removes ASCII value 32.  

Technically speaking, I got off easy, as the case could have been a bunch of non-printable characters from some of my nasty data sources in which case cleansing my data sources would have been much more tedious.  Luckily, in my case I just need to handle another type of whitespace and this can easily be sorted out with a TRIM function.

About these ads

No comments yet... Be the first to leave a reply!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

%d bloggers like this: