Today’s post is a very simple tip based on a question I received from one of my colleagues. They had dates stored in their database and wanted to extract day of the week and day of the year information from the date column.
Now, this would have been a very easy task for the team had this requirement been related to displaying the day of the week and day of the year information on the UI. But, in this particular case, they had to dump it out to a text file (using SQL Server BCP) for exchanging with a legacy reader-board feeding system (If you are unfamiliar with the term, the display boards on railway stations and airports that inform travellers about the schedule and other stages of the boarding process and the associated data stores are reader-board systems). When they approached me, they told me it was a big problem and were wondering if I could spend a day with them to help them figure things out. When I said it won’t even take more than 10 minutes, they were awestruck!
The DATEPART function
Most of us have worked with T-SQL date manipulation functions in Microsoft SQL Server, with the most common use being extracting parts like date, month and year from the date or a date/time value. However, there are many more options, which are what I shared with the team.
Besides extracting parts of a date, the DATEPART function can also help us determine the following for a given date or date/time value:
- The Day of the Week
- The Day of the Year
- Week number
- ISO Week Number
Here’s the associated query:
USE tempdb GO DECLARE @currentDate DATE = GETDATE() SELECT @currentDate AS CurrentDate, DATEPART(DAYOFYEAR, @currentDate) AS DayOftheYear, DATEPART(WEEKDAY , @currentDate) AS DayOftheWeek, DATEPART(WEEK , @currentDate) AS WeekNumber, DATEPART(ISO_WEEK , @currentDate) AS ISOWeekNumber GO /* DAYOFYEAR --Equivalent interval identifiers: DY WEEKDAY --Equivalent interval identifiers: DW WEEK --Equivalent interval identifiers: W ISO_WEEK --Equivalent interval identifiers: ISOWK, ISOWW */
For a test value of ‘2012-08-25’, the results are:
All the team had to do was to incorporate the DATEPART function call in their T-SQL queries. There – did it take more than 10 minutes?
If you would like to explore all the options of the DATEPART function, I would redirect you to the Books-On-Line page: http://msdn.microsoft.com/en-us/library/ms174420.aspx
Also, if you would like to read more about how the team exported bulk data from their SQL Server tables into a text/comma-separated file, you may refer my previous posts:
- BCP & Bulk Inserts – Underappreciated features of Microsoft SQL Server
- SSMS – Query Result configuration – Generate comma separated output without BCP or query changes
Until we meet next time,
Good one Nakul! But I see one problem with the Week related expression. Since DATEPART() depends on the value of @@DATEFIRST, so the value you get will vary depending on the value of @@DATEFIRST at the session level. Here is a more concrete way of doing it: [Query to Get Week Start Date, Month Start Date and Year Start Date Based on Input Date]
@Datta: You are absolutely correct – the DATEPART() is dependent on @@DATEFIRST for Week-related expressions. Thank-you for your feedback!
The ANSI Standard version of this is a little different:
::= EXTRACT FROM ::= | ::= TIMEZONE_HOUR | TIMEZONE_MINUTE ::= |
The is the only place SQL has fields in SQL; I wish noobs would quit confusing fields and columns. The function can take a first argument of DATE, YEAR, MONTH, DAY, TIME, HOUR, MINUTE, SECOND. The DATE and TIME are built from the obvious fields. T-SQL does not have teh INTERVAL temporal types yet.
A useful idiom is a report period calendar. It gives a name to a range of dates.
CREATE TABLE Report_Periods (report_name CHAR(10) NOT NULL PRIMARY KEY, report_start_date DATE NOT NULL, report_end_date DATE NOT NULL, CONSTRAINT date_ordering CHECK (report_start_date <= report_end_date), etc);
These report periods can overlap; a fiscal quarter will be contained in the range of its fiscal year. There can be gaps between them; we have to wait a year between each “Annual Going out Of Business Sale!” and there might be long stretches of time without any special sales. But we want to know their ranges so that the table is fairly constant once it is created.
I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL.
Since SQL is a database language, we prefer to do look ups and not calculations. I would build a calendar table with a ISO-8601 week-within-year format column in addition to the usual Common Era date. This format is 'yyyyWww-d' where yyyy is the year, W is as separator token, ww is (01-53) week number and d is (1-7) day of the week.
You input any calendar date, find the week-within-year column and return the dates that match on a LIKE predicate. WHERE sale_day LIKE '2012W26-'
There are several websites with calendars you can cur & paste, but you can start uyoure serch with: http://www.calendar-365.com/week-number.html
Use a table of time slots set to one more decimal second of precision than your data. You can now use temporal math to add it to a DATE to get a full DATETIME2(0)
CREATE TABLE Timeslots (start_time TIME(1) NOT NULL, (end_time TIME(1) NOT NULL, CHECK (start_time < end_time));
INSERT INTO Timeslots — put data into 15-minute slots VALUES ('00:00:00.0', '00:14:59.9'), ('00:15:00.0', '00:29:59.9'), ('00:30:00.0', '00:44:59.9'), ('00:45:00.0', '01:00:59.9'), .. ('23:45:00.0', '23:59:59.9');
Here is the basic query for rounding down to a time slot.
SELECT CAST (@in_timestamp AS DATE), T.start_time FROM Timeslots AS T WHERE CAST (@in_timestamp AS TIME) BETWEEN T.start_time AND T.end_time;
> I wish noobs would quit confusing
> fields and columns
Good to see you again Joe. As a non-noob (you and I crossed paths on the CompuServ MSACCESS forum more than 10 years ago), could you clarify the difference between a field and a column?
A table is made up of columns. A column belongs to a table.
A field is a columns’ value within a specific row. A field and a tuple are interchangable.
@jcelko, @Marc Jellinek: Thank-you for taking the time out and sharing your feedback on my posts! They are very informative, and I always learn something new!
In 25 words or less it is “Logical versus Physical”, but it goes beyond that. If you do not have a good conceptual model, you hit a ceiling and cannot get past a certain level of competency.
Rows are not records; fields are not columns; tables are not files; there is no sequential access or ordering in an RDBMS, so “first”, “next” and “last” are totally meaningless.
A file is made up of records, records are made up of fields. Files are independent of each other, while tables in a database are interrelated. You open an entire database, not tables; you open individual files. The original idea of a database was to collect data in a way that avoided redundant data in too many files and not have it dependent on a particular programmng language.
A file is ordered and can be accessed by a PHYSICAL location, while a table is not. “first record”, “last record”, and “next n records” make sense in a file but not in a table. Look at how many times we get a posting where someone wants to sequentially read a table. In a procedural language, “READ a,b,c FROM FileX;” does not give the same results as “READ b,c,a FROM FileX;” and some languages will let you write “READ a,a,a FROM FileX;”
A file is usually associated with a particular language — ever try to read a Fortran file with a Cobol program? A database is language independent; the internal SQL data types are converted into host language data types.
A field exists only because of the program reading it; a column exists because it is in a table in a database. Columns have DEFAULT, COMPUTE, REFERENCES CHECK(), NOT NULL, etc to ensure data integrity; fields are passive.
A field is fixed or variable length, can repeated with an OCCURS in Cobol, etc. It does not have to be a scalar value like a column. A field can change data types (union in ‘C’, VARIANT in Pascal, REDEFINES in Cobol); a column cannot.You have no idea whatsoever how0 a column is represented internally; you never see it. SQL is more abstract than a file system; you think of a number as a NUMBER, and not as NUMERAL.
I wish that more programmers had worked with a magnetic tape system. It is based on physical contigous storage for everything. This is where the idea of a primary key came from. Everything in a tape system depends the files being sorted on the same key, so you can merge data.
It was awhile before Dr. Codd changed his mind and said that all keys are keys, and we don’t need a special one in a relational database, like you did in a tape system.
Codd defined a row as a representation of a fact. A record is usually a combination of a lot of facts. That is, we don’t normalize a file; you stuff data into it and hope that you have everything you need for an application.
Rows and columns have constraints. Records and fields can have anything in them and often do!! Talk to anyone who has tried to build a data warehouse about that …
The other “un-learning” that I see when I teach an SQL class is with people who learned spreadsheets. They think that tables are made of rows and columns; no, tables are made of rows and rows are made of columns. Columns hold values, not values or formulas.