#0404 – SQL Server – Interview Question – What is logical data integrity?


Recently, I encountered an interesting question in one of the forums:

What is logical data integrity?

The person who posted the question had been reading about SQL Server and databases in general when they came across this term. Because the answer can help clarify one’s understanding of data design concepts, I thought it would also make a very interesting interview question.

Today, I try to describe what logical data integrity is.

What is data integrity?

Data is a critical part of any business. But, data by itself holds no value. For data to be information of business value, it needs to be valid with respect to the business domain.

A piece of data may be perfectly acceptable from the physical design perspective, but may still be invalid for the domain.

Let’s take an example – a heart rate of 2000 is perfectly acceptable for an integer. That is physical data integrity – the value is valid with respect to the physical design of the database. But if we are talking about an application that captures and analyzes patient/medicinal data, a heart rate of 2000 is totally invalid and indicates some sort of logical bug/corruption.

Other examples would be a meeting end date that’s less than the meeting start date or a business/person without a name.

A data point may not be acceptable within the business rules defined for a domain. Similarly, what’s valid as a data point for one domain may be invalid for another domain. Ensuring that your database only accepts values that are valid with respect to your domain is what I call “logical data integrity”.

Types of Data Integrity

Logical data integrity can be enforced in two ways:

Declarative Data Integrity

If data integrity is enforced via the data model (implemented via the Data Definition Language, i.e. DDL), it is declarative data integrity. One would enforce declarative integrity via the elements of the table definition:

  • Appropriate Data-Types
    • In our example for the medical domain, using a TINYINT (which can only hold values from 0 to 255) instead of an INT to store the heart rate would limit the possibility of corruption
  • Primary Keys
    • Avoid the insertion of duplicate data!
  • Foreign Keys
    • Ensure that every reference is known (i.e. it matches an existing primary key or unique key value in the referenced table)
  • Default, Check, Unique and Not-NULL constraints
    • Unique and Not-NULL constraints help maintain uniqueness and avoid insertion of unknown (NULL) data
    • Default constraints ensure that a valid default value is stored when an INSERT does not supply a value for the column (an explicitly supplied NULL is not replaced, which is where the Not-NULL constraint helps)
    • Check constraints help ensure that data meets the valid range defined by the business (e.g. a check constraint would help ensure that the meeting end date is greater than or equal to the start date)
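
To make the list above concrete, here is a minimal sketch of declarative integrity for the medical example. The table names, column names, constraint names and the accepted heart rate range (20 to 250) are my own assumptions for illustration, not rules from any real system.

```sql
USE tempdb;
GO
--Safety Check (drop the child table first because of the foreign key)
IF OBJECT_ID('dbo.PatientVisit','U') IS NOT NULL
BEGIN
   DROP TABLE dbo.PatientVisit;
END
IF OBJECT_ID('dbo.Patient','U') IS NOT NULL
BEGIN
   DROP TABLE dbo.Patient;
END
GO

--Hypothetical parent table
CREATE TABLE dbo.Patient
            (PatientId   INT           NOT NULL IDENTITY(1,1),
             PatientName NVARCHAR(100) NOT NULL,
             CONSTRAINT pk_Patient PRIMARY KEY CLUSTERED (PatientId)
            );
GO

--Hypothetical child table; every rule below is part of the table definition
CREATE TABLE dbo.PatientVisit
            (VisitId    INT          NOT NULL IDENTITY(1,1),
             PatientId  INT          NOT NULL,
             HeartRate  TINYINT      NOT NULL,
             VisitStart DATETIME2(0) NOT NULL
                        CONSTRAINT df_PatientVisit_VisitStart DEFAULT (SYSDATETIME()),
             VisitEnd   DATETIME2(0)     NULL,
             CONSTRAINT pk_PatientVisit PRIMARY KEY CLUSTERED (VisitId),
             CONSTRAINT fk_PatientVisit_Patient FOREIGN KEY (PatientId)
                        REFERENCES dbo.Patient (PatientId),
             CONSTRAINT uq_PatientVisit UNIQUE (PatientId, VisitStart),
             --Assumed business range, for illustration only
             CONSTRAINT chk_PatientVisit_HeartRate CHECK (HeartRate BETWEEN 20 AND 250),
             --A NULL VisitEnd (visit still open) passes this check
             CONSTRAINT chk_PatientVisit_Dates CHECK (VisitEnd >= VisitStart)
            );
GO

INSERT INTO dbo.Patient (PatientName) VALUES (N'Test Patient');
GO

--Rejected by the data type: 2000 does not fit in a TINYINT
INSERT INTO dbo.PatientVisit (PatientId, HeartRate) VALUES (1, 2000);
GO
--Rejected by the check constraint: fits in a TINYINT, but is outside the assumed range
INSERT INTO dbo.PatientVisit (PatientId, HeartRate) VALUES (1, 10);
GO
--Rejected by the foreign key: there is no patient with PatientId = 99
INSERT INTO dbo.PatientVisit (PatientId, HeartRate) VALUES (99, 72);
GO
--Accepted: VisitStart is filled in by the default constraint
INSERT INTO dbo.PatientVisit (PatientId, HeartRate) VALUES (1, 72);
GO
```

Each statement is in its own batch so that one rejected insert does not stop the rest of the script from running.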

Procedural Data Integrity

Legacy applications (I have worked on a few that match this description), which were originally developed in the days of flat-file databases, often used procedural code to enforce data integrity.

When these were migrated to Microsoft SQL Server, the integrity was enforced via stored procedures and triggers to avoid re-engineering the database structure and changing the application code to match the new structure.

Data integrity enforced via code, i.e. via stored procedures, triggers and/or functions, is called procedural data integrity.

My take: Procedural code can be disabled, fail or have bugs. This may cause the application code to generate bad/invalid data rather than prevent it.
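
As a rough illustration of both the technique and the risk, the sketch below enforces the meeting date rule with a trigger and then shows how disabling the trigger lets invalid data through. The table dbo.Meeting, the trigger name and the error number 50001 are assumptions of mine, and THROW requires SQL Server 2012 or later.

```sql
USE tempdb;
GO
--Safety Check
IF OBJECT_ID('dbo.Meeting','U') IS NOT NULL
BEGIN
   DROP TABLE dbo.Meeting;
END
GO

--Hypothetical table with no declarative rule on the dates
CREATE TABLE dbo.Meeting
            (MeetingId    INT          NOT NULL IDENTITY(1,1) PRIMARY KEY,
             MeetingStart DATETIME2(0) NOT NULL,
             MeetingEnd   DATETIME2(0) NOT NULL
            );
GO

--Procedural integrity: the rule lives in code, not in the table definition
CREATE TRIGGER dbo.trg_Meeting_ValidateDates
ON dbo.Meeting
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS (SELECT 1
               FROM inserted AS i
               WHERE i.MeetingEnd < i.MeetingStart)
    BEGIN
        ROLLBACK TRANSACTION;
        THROW 50001, 'Meeting end date cannot be earlier than the start date.', 1;
    END
END
GO

--Rejected while the trigger is enabled
INSERT INTO dbo.Meeting (MeetingStart, MeetingEnd)
VALUES ('2017-01-10 10:00', '2017-01-09 10:00');
GO

--Anyone with sufficient permissions can switch the rule off...
DISABLE TRIGGER dbo.trg_Meeting_ValidateDates ON dbo.Meeting;
GO

--...after which the same invalid row is accepted
INSERT INTO dbo.Meeting (MeetingStart, MeetingEnd)
VALUES ('2017-01-10 10:00', '2017-01-09 10:00');
GO
```

A check constraint can also be disabled, but it sits in the table definition where it is easier to discover and audit than logic spread across triggers and procedures.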

I believe procedural data integrity is acceptable as long as it is used as a “fail-safe” mechanism. The primary mechanism to ensure logical data integrity should be declarative in nature, in my humble opinion.

The above is my take on logical data integrity. I welcome your thoughts on the subject in the space below.

Until we meet next time,

Be courteous. Drive responsibly.

#0403 – SQL Server – CAST/CONVERT to string – Pad zeroes or spaces to an integer


Helping the community via forums often leads to some very interesting moments. Recently, I came across quite a common question – as part of a data migration, someone wanted to pad integers with zeroes. There are several variations of this question, namely:

How do I pad zeroes to convert an integer to a fixed length string?

How do I pad zeroes before an integer?

How do I pad blank spaces before an integer?

All of these questions have quite a simple solution, which I am going to present before you today.

The script demonstrates the process of padding the required values to a set of integers in a test table. The script:

  1. Converts the Integer to a string
  2. Appends this string representation of the integer to the padding string
  3. Finally, returns the required number of characters from the right of the string

For the purposes of this demo, I have shown the result with two padding characters – a zero (0) and an asterisk (*).

Have you ever faced such a requirement as part of a data migration or an integration? Do you use a similar approach? Do share your thoughts and suggestions in the space below.

--Pad zeroes in string representation of a number
USE tempdb;
GO
--Safety Check
IF OBJECT_ID('dbo.TestTable','U') IS NOT NULL
BEGIN
   DROP TABLE dbo.TestTable;
END
GO

--Create the test table
CREATE TABLE dbo.TestTable
            (RecordId    INT NOT NULL IDENTITY(1,1),
             RecordValue INT     NULL
            );
GO

--Populate some test data
INSERT INTO dbo.TestTable (RecordValue)
VALUES (123),
       (1023),
       (NULL);
GO

/**************** PADDING CHARACTER: ZERO (0) ****************************/

--Change the padding character and the required string length as needed
DECLARE @requiredStringLength INT = 10;
DECLARE @paddingCharacter CHAR(1) = '0';

--The script:
--1. Converts the Integer to a string
--2. Appends this string representation of the integer to the padding string
--3. Finally, returns the required number of characters from the right of the string
SELECT RecordId,
       RecordValue AS OriginalValue,
       RIGHT( (REPLICATE( @paddingCharacter, @requiredStringLength )
              + CAST(RecordValue AS VARCHAR(20))
              ),
              @requiredStringLength
            ) AS PaddedValue
FROM dbo.TestTable AS tt;
GO

/* RESULTS
RecordId    OriginalValue PaddedValue
----------- ------------- ------------
1           123           0000000123
2           1023          0000001023
3           NULL          NULL

*/

/**************** PADDING CHARACTER: ASTERISK (*) ****************************/

--Change the padding character and the required string length as needed
DECLARE @requiredStringLength INT = 10;
DECLARE @paddingCharacter CHAR(1) = '*';

--The script:
--1. Converts the Integer to a string
--2. Appends this string representation of the integer to the padding string
--3. Finally, returns the required number of characters from the right of the string
SELECT RecordId,
       RecordValue AS OriginalValue,
       RIGHT( (REPLICATE( @paddingCharacter, @requiredStringLength )
               + CAST(RecordValue AS VARCHAR(20))
              ),
              @requiredStringLength
            ) AS PaddedValue
FROM dbo.TestTable AS tt;
GO

/* RESULTS
RecordId    OriginalValue PaddedValue
----------- ------------- ------------
1           123           *******123
2           1023          ******1023
3           NULL          NULL
*/
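
One caveat with the approach above: RIGHT() silently drops leading digits when a value has more characters than @requiredStringLength. Below is a minimal sketch of one possible guard, assuming you would rather return the unpadded string than lose digits. The SafePaddedValue alias and the sample values are my own illustration.

```sql
DECLARE @requiredStringLength INT = 5;
DECLARE @paddingCharacter CHAR(1) = '0';

SELECT v.RecordValue AS OriginalValue,
       --Original approach: truncates values longer than the required length
       RIGHT( (REPLICATE( @paddingCharacter, @requiredStringLength )
               + CAST(v.RecordValue AS VARCHAR(20))
              ),
              @requiredStringLength
            ) AS PaddedValue,
       --Guarded approach: pad only when the value is shorter than the required length
       CASE WHEN LEN(CAST(v.RecordValue AS VARCHAR(20))) >= @requiredStringLength
            THEN CAST(v.RecordValue AS VARCHAR(20))
            ELSE RIGHT( (REPLICATE( @paddingCharacter, @requiredStringLength )
                         + CAST(v.RecordValue AS VARCHAR(20))
                        ),
                        @requiredStringLength
                      )
       END AS SafePaddedValue
FROM (VALUES (123), (1234567)) AS v (RecordValue);
GO

/* RESULTS
OriginalValue PaddedValue SafePaddedValue
------------- ----------- ---------------
123           00123       00123
1234567       34567       1234567
*/
```

Whether to return the longer string, return NULL or raise an error for oversized values depends on what the target system of the migration expects.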

Until we meet next time,

Be courteous. Drive responsibly.