I was recently discussing database design with one of my friends. We were working on file & file-group design and one of the things that came up almost immediately was the designing of log files (number, initial size, growth factors, etc) for the database.
During this exercise, we noticed that the first transaction log file for the database always got a file_id value of 2. We knew the primary data file always received a file_id = 1 because that’s the first file being created, but if the database has multiple data files, we expected them to be created before the log files were created.
Here’s a reproduction of the tests we did:
Create the databases
Let us first create 3 databases, such that the following configurations are covered:
- One Data, One Log file (simplest configuration)
- One Data, Multiple Log files
- Multiple Data, Multiple Log files
/****************************************************************************** Scenario 01: One Data, One Log file ******************************************************************************/ USE master; GO IF EXISTS (SELECT * FROM master.sys.databases WHERE name = 'OneDataOneLogDB') BEGIN ALTER DATABASE OneDataOneLogDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE OneDataOneLogDB; END; GO CREATE DATABASE [OneDataOneLogDB] ON PRIMARY(NAME = [DB_DEFAULT], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAOneDataOneLogDB.mdf', MAXSIZE = UNLIMITED, FILEGROWTH = 51200 KB) LOG ON (NAME = [DB_LOG], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAOneDataOneLogDB.ldf', MAXSIZE = UNLIMITED, FILEGROWTH = 102400 KB); GO /****************************************************************************** Scenario 02: One Data, Multiple log files ******************************************************************************/ USE master; GO IF EXISTS (SELECT * FROM master.sys.databases WHERE name = 'OneDataMultiLogDB') BEGIN ALTER DATABASE OneDataMultiLogDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE OneDataMultiLogDB; END; GO CREATE DATABASE [OneDataMultiLogDB] ON PRIMARY(NAME = [DB_DEFAULT], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAOneDataMultiLogDB.mdf', MAXSIZE = UNLIMITED, FILEGROWTH = 51200 KB) LOG ON (NAME = [DB_LOG], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAOneDataMultiLogDB.ldf', MAXSIZE = UNLIMITED, FILEGROWTH = 102400 KB), (NAME = [DB_LOG1], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAOneDataMultiLogDB1.ldf', MAXSIZE = UNLIMITED, FILEGROWTH = 102400 KB); GO /****************************************************************************** Scenario 03: Multiple Data, Multiple log files ******************************************************************************/ USE master; GO IF EXISTS (SELECT * FROM master.sys.databases WHERE name = 'MultiDataMultiLogDB') BEGIN ALTER DATABASE MultiDataMultiLogDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE MultiDataMultiLogDB; END; GO CREATE DATABASE [MultiDataMultiLogDB] ON PRIMARY(NAME = [DB_DEFAULT], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAMultiDataMultiLogDB.mdf', MAXSIZE = UNLIMITED, FILEGROWTH = 51200 KB), (NAME = [DB_DATA], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAMultiDataMultiLogDB_Data1.mdf', MAXSIZE = UNLIMITED, FILEGROWTH = 51200 KB) LOG ON (NAME = [DB_LOG], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAMultiDataMultiLogDB.ldf', MAXSIZE = UNLIMITED, FILEGROWTH = 102400 KB), (NAME = [DB_LOG1], FILENAME = 'C:Program FilesMicrosoft SQL ServerMSSQL11.SQL2K12MSSQLDATAMultiDataMultiLogDB1.ldf', MAXSIZE = UNLIMITED, FILEGROWTH = 102400 KB); GO
Select the file_ids of the databases
Now, let us select the file_ids for the files in the database. To do so, we will be using the sys.database_files view.
USE OneDataOneLogDB; GO SELECT 'Scenario 01 - One Data, One Log File' AS Scenario, file_id AS FileId, type AS FileType, name AS FileLogicalName, physical_name AS FilePhysicalName FROM OneDataOneLogDB.sys.database_files; GO USE OneDataMultiLogDB; GO SELECT 'Scenario 02 - One Data, Multiple Log Files' AS Scenario, file_id AS FileId, type AS FileType, name AS FileLogicalName, physical_name AS FilePhysicalName FROM OneDataMultiLogDB.sys.database_files; GO USE MultiDataMultiLogDB; GO SELECT 'Scenario 02 - One Data, Multiple Log Files' AS Scenario, file_id AS FileId, type AS FileType, name AS FileLogicalName, physical_name AS FilePhysicalName FROM MultiDataMultiLogDB.sys.database_files; GO
The results are shown in the image below.
If you observe the high-lighted area, you will notice that the file_id for the primary log file is always “2”. Essentially, the file_ids are assigned in the following manner:
- Primary Data File –> file_id = 1
- Primary Log File –> file_id = 2
- Secondary Data Files (in order of creation) –> file_ids = 3 to n
- Secondary Log Files (in order of creation) –> file_ids = (n+1) to m (where n = file_id of the last data file)
So, what’s going on?
After a bit of thinking, we realized that we were running short of coffee! That’s because this behaviour is obviously working as designed!
All user databases (and the tempdb) are copies of the model database – that’s one of the many the universal truths that Microsoft SQL Server functions on.
The model database comes out-of-the-box with every SQL Server installation with two (2) files – one data, and one log file (having Ids 1 & 2 respectively). That’s why whenever a new user database is created, the primary data file always receives a file_id = 1 and the primary log file always receives a file_id = 2.
References:
- sys.database_files: http://msdn.microsoft.com/en-us/library/ms174397.aspx
- TempDB – Is it a copy of the model database?: http://bit.ly/T6Ikjq
- TempDB v/s Model database – minimum size considerations: http://bit.ly/RMAMPq
Until we meet next time,
Why create multiple log files?
LikeLike
The database that we were working on is supposed to be a “once-in-a-lifetime” staging environment between two systems – an older client-server and a new private cloud based system.
Since large amounts of bulk data was to be imported/merged and other data quality operations to be performed (we were running SQL 2008 R2), we allowed for a huge initial log file spread across multiple drives. The database will obviously take a long time to be created (due to zero-initialization of the log) and because the database is for a one-time use only (with a process designed such that it can be restarted in case of a failure), recovery was not a big concern.
LikeLike
Next time, rather than spreading the logs over multiple “drives”, create a RAID0 stripe. This way you’ll get the benefit of all the drives bandwidth.
* RISK * RAID 0 is not fault-tolerant. If you lose one drive, you lose them all.
LikeLike
Thank-you, Marc! Yes, RAID 0 is indeed useful in such situations where recovery is not the most important, but performance is.
LikeLike