#0330 – SQL Server 2012 – Running Total Problem and Windowing Functions


The T-SQL language constantly undergoes modifications. Every release bring in something new, something to help develop an easier solution to a common business problem. With Microsoft SQL Server 2012, enhancements have been incorporated to the Windows functions to allow an easier solution to what is commonly referred to as the “Running Total Problem”.


The whole concept of running totals is that as the transactions continue to be added, the system maintains the total of a given sequence. This allows the system to fetch this summary without having to maintain any record of the individual transactions itself.


Here’s an example based on the transactions for one of the SalesOrders in the AdventureWorks2012 sample database.

















































































SalesOrderId UnitPrice OrderQty Running Total Running Total Formula
43659 2024.994 1 2024.994 (Unit Price * Order Quantity)
43659 2024.994 3 8099.976 Total from previous line + (Unit Price * Order Quantity)
43659 2024.994 1 10124.97 (same as above)
43659 2039.994 1 12164.964 (same as above)
43659 2039.994 1 14204.958 (same as above)
43659 2039.994 2 18284.946 (same as above)
43659 2039.994 1 20324.94 (same as above)
43659 28.8404 3 20411.4612 (same as above)
43659 28.8404 1 20440.3016 (same as above)
43659 5.7 6 20474.5016 (same as above)
43659 5.1865 2 20484.8746 (same as above)
43659 20.1865 4 20565.6206 (same as above)

Essentially, the running total at each row is the sum of the running total value from the previous row and the expression to be added from the current row (in this case, Unit Price * Order Quantity). That’s where the problems start in SQL Server.


The running total is calculated in a sequence and unfortunately, SQL Server does not guarantee the order of records in a result set which makes it difficult to compute the running total in a SET based operation. The solution to this problem has therefore been a CURSOR operation where each record is independently processed in a loop.


However, the T-SQL enhancements made in SQL Server 2012 have a possible solution to this problem. The the aggregate function SUM() has been enhanced to operate as a windowing function which now supports the ORDER BY clause in the OVER clause. This gives the ability for SUM to be performed in the intended sequence of the transactions. Here’s a query to demonstrate this:

USE AdventureWorks2012;
GO
SELECT SalesOrderID,
       SalesOrderDetailID,
       UnitPrice,
       OrderQty,
       LineTotal,
       SUM(UnitPrice * OrderQty) 
            OVER (PARTITION BY SalesOrderId 
                  ORDER BY SalesOrderDetailID
                 ) 
                 AS RunningTotal
FROM Sales.SalesOrderDetail
WHERE SalesOrderID = 43659;
GO

NOTE: Attempting to run this query against a SQL Server 2008R2 or lower environment will result in an error related to missing support of the ORDER BY clause. They support the use of OVER and PARTITION BY with the SUM() function, but do not support ORDER BY.


The result set available from the query is shown below.


image


As can be seen from the result set, Microsoft SQL Server auto-calculated the running total for the given Sales Order in a given sequence. These results also match with the table provided above.


Finally, the purpose of a running total is that a system/user should not have to maintain a record of all the transactions. Hence, the maximum value of the running total should match the Sub-total stored on the SalesOrder header record.

USE AdventureWorks2012;
GO
SELECT SalesOrderID,
       SubTotal,
       TaxAmt,
       Freight,
       TotalDue
FROM Sales.SalesOrderHeader
WHERE SalesOrderID = 43659;
GO

image


Summary


The ORDER BY clause can be used when using SUM() over a window defined by the OVER clause to generate a running total of values in a given sequence.


It is really amazing that the solution to  a problem that has challenged many brains over the years can be this simple. Kudos to the #SQLServer product development team at #Microsoft. Thanks, guys!


Further Reading



  • Windowing functions – Underappreciated features of Microsoft SQL Server [Link]
  • Aggregate Function – SUM [Link]
  • Over Clause [Link]
  • Running Totals, Wikipedia [Link]


Until we meet next time,



Be courteous. Drive responsibly.

Advertisements

4 thoughts on “#0330 – SQL Server 2012 – Running Total Problem and Windowing Functions

  1. WayneS

    One thing to be careful of here – your example uses the default for the ROWS/RANGE clause, which is RANGE BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW. Range works by the value of the column, so if you have a duplicate value then those are added together (based upon the ORDER BY column(s) within a partition).
    A quick example:

    DECLARE @test TABLE (RowID INTEGER IDENTITY, TotalValue INTEGER);
    INSERT INTO @test (TotalValue) VALUES (1), (2), (3), (3), (4);
    SELECT RowID, TotalValue,
    SUM(TotalValue) OVER (ORDER BY TotalValue) AS [Range],
    SUM(TotalValue) OVER (ORDER BY TotalValue ROWS UNBOUNDED PRECEDING) AS [Rows]
    FROM @test;

    Results:

    RowID TotalValue Range Rows
    ———– ———– ———– ———–
    1 1 1 1
    2 2 3 3
    3 3 9 6
    4 3 9 9
    5 4 13 13

    You can see how the two values of “3” were handled differently. For a true “Running Total”, ensure that you specify the ROWS clause and don’t just take the default.

    Like

    Reply
  2. Nakul Vachhrajani

    Thank-you for the feedback, Wayne! Yes, the duplicate values will have an effect in the default clause that my formula uses. The UNBOUNDED PRECEDING clause must be mentioned if the column being aggregated may have duplicate values which need to be eliminated.

    Like

    Reply
  3. WayneS

    Nakul,

    No, you need to not use the default ROWS/RANGE clause of RANGE. To accomplish that, you need to specify ROWS.
    UNBOUNDED PRECEDING is the starting window declaration. In my example, I used UNBOUNDED PRECEDING, and accepted the default ending window declaration of CURRENT ROW. When specifing ROWS (or RANGE), you need to specify the window boundaries, so in my example I ended up using “ROWS UNBOUNDED PRECEDING”. I could have also used “ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW”.

    You might want to look over the “OVER Clause” BOL at [http://technet.microsoft.com/en-us/library/ms189461.aspx][1]

    [1]: http://technet.microsoft.com/en-us/library/ms189461.aspx

    Like

    Reply
  4. brimhj

    Nakul, Wayne was pointing out that ROWS must be explicitly stated since you are performing a running totals scenario. Without specifying ROWS, then the default is RANGE BETWEEN UNBOUNDED PRECEEDING AND CURRENT ROW (e.g. it is Range and not ROW).

    Like

    Reply

Let me know what you think about this post by leaving your feedback here!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s