String searches in most applications are based on the “contains” or “begins-with” paradigms – which are implemented by the use of the LIKE logical operator. The LIKE operator allows for the use of wildcard characters (underscore being one of them), which is the part that most people unknowingly overlook when they have been working with T-SQL for quite some time.
One such incident happened the other day at work when I was asked by a colleague to help him out with a query that appeared correct, but failed to give the expected results. For the sake of brevity, I have condensed the issue into a test scenario below:
DECLARE @testTbl TABLE (Value VARCHAR(50)); INSERT INTO @testTbl (Value) VALUES ('SQL 2012'), ('SQL_2014'); SELECT * FROM @testTbl WHERE Value LIKE 'SQL_20%';
As can be seen in the screenshot above, the results are not what one would expect them to be at the first glance. One would have expected it to return only the string “SQL_2014” and yet the T-SQL query is returning both the records.
Normally, data coming in via flat-files from raw-data collection systems would require such kinds of string searches and manipulations. A well designed system would have these static values as a look-up so that all references look at the same version of the string value
Solution(s)
There are not one, but two possible solutions to choose from to work around this issue.
Using square brackets ([]) around the wildcard character
Because we want to consider the underscore as a character and not as a wildcard, the easiest thing to do would be to surround it with square brackets, similar to what is shown in the query below.
SELECT * FROM @testTbl WHERE Value LIKE 'SQL[_]20%';
Using the ESCAPE keyword
The other option, which I have discussed in one of my earlier posts, is to use the ESCAPE keyword. The ESCAPE keyword works on the basis of a user specified escape sequence. When encountered, the query engine will simply ignore the wildcard character and treat it as a normal character.
SELECT * FROM @testTbl WHERE Value LIKE 'SQL*_20%' ESCAPE '*';
Conclusion
The moral here is that one must always remember the basics – this thumb rule not only applies to software, but in everything that we go about doing in our day to day lives.
The other key learning that I encourage everyone to take from this is that Occam’s Razor is real – from the multiple explanations that a problem may have, the explanation with the least number of assumptions should be selected as the answer. In this case, the simplest theory was – human error, which was indeed the case.
References
-
Pattern matching in SQL Server [http://msdn.microsoft.com/en-US/library/ms187489(v=sql.105).aspx]
-
Implementing Windows style wildcard search in SQL Server, using the ESCAPE keyword [https://nakulvachhrajani.com/2013/04/22/0259-sql-server-implementing-wildcard-search-escape-keyword/]
-
LIKE Operator [https://msdn.microsoft.com/en-us/library/ms179859.aspx]
- Best Practices – Avoid wild-card characters at the beginning of a search string [https://nakulvachhrajani.com/2013/10/31/0306-sql-server-best-practices-avoid-wild-card-characters-at-the-beginning-of-a-search-string/]
Until we meet next time,
Be courteous. Drive responsibly.