Last modified: August 19, 2020
This article is written in: 🇺🇸
Window functions in SQL are powerful tools that allow you to perform calculations across a set of table rows that are related to the current row. Unlike aggregate functions, window functions do not collapse rows into a single output row; instead, they retain the individual row identities while providing additional analytical capabilities. Window functions are essential for advanced data analysis, reporting, and generating insights that require comparisons or calculations across related rows.
After reading the material, you should be able to answer the following questions:
OVER clause define the window for window functions, and what components can it include?Here are some of the most commonly used window functions in SQL:
ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition of a result set, starting at 1 for the first row in each partition.RANK(): Assigns a rank to each row within a partition of a result set, with gaps in ranking values when there are ties.DENSE_RANK(): Similar to RANK(), but without gaps in ranking values when there are ties.NTILE(): Distributes rows into a specified number of approximately equal groups.LEAD(): Provides access to a subsequent row’s data without the need for a self-join.LAG(): Provides access to a preceding row’s data without the need for a self-join.SUM(), AVG(), COUNT(), which perform aggregate calculations over a window.Suppose we have two tables: Sales and Products.
Sales Table
| SaleID | ProductID | SaleDate | Quantity | Price |
| 1 | 101 | 2024-01-05 | 10 | 15.00 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 |
Products Table
| ProductID | ProductName | Category |
| 101 | Widget A | Gadgets |
| 102 | Widget B | Gadgets |
| 103 | Gizmo C | Widgets |
| 104 | Gizmo D | Widgets |
OVER ClauseWindow functions use the OVER clause to define the window or the set of rows the function should operate on. The OVER clause can include:
PARTITION BY: Divides the result set into partitions to which the window function is applied.ORDER BY: Defines the logical order of rows within each partition.ROWS BETWEEN).The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition, starting at 1 for the first row in each partition.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | RowNum |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 1 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 2 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 3 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 1 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 2 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 3 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | 1 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 2 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | 1 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 2 |
ROW_NUMBER() function assigns a unique row number to each sale within its ProductID partition based on the SaleDate.Both RANK() and DENSE_RANK() assign a rank to each row within a partition. The difference lies in how they handle ties.
RANK(): Assigns the same rank to tied rows but leaves gaps in the ranking sequence.DENSE_RANK(): Assigns the same rank to tied rows without leaving gaps.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS Rank,
DENSE_RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS DenseRank
FROM
Sales
ORDER BY
ProductID,
Quantity DESC;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | Rank | DenseRank |
| 101 | 101 | 2024-01-10 | 20 | 15.00 | 1 | 1 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 2 | 2 |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 3 | 3 |
| 102 | 102 | 2024-01-15 | 10 | 25.00 | 1 | 1 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 2 | 2 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 3 | 3 |
| 103 | 103 | 2024-01-12 | 7 | 30.00 | 1 | 1 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 2 | 2 |
| 104 | 104 | 2024-01-25 | 12 | 20.00 | 1 | 1 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 2 | 2 |
RANK() and DENSE_RANK() are useful for identifying the position of rows within partitions, especially when dealing with ties.The NTILE() function distributes the rows in an ordered partition into a specified number of roughly equal groups.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
NTILE(4) OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS Quartile
FROM
Sales
ORDER BY
ProductID,
Quantity DESC;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | Quartile |
| 101 | 101 | 2024-01-10 | 20 | 15.00 | 1 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 2 |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 3 |
| 102 | 102 | 2024-01-15 | 10 | 25.00 | 1 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 2 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 3 |
| 103 | 103 | 2024-01-12 | 7 | 30.00 | 1 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 2 |
| 104 | 104 | 2024-01-25 | 12 | 20.00 | 1 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 2 |
NTILE(4) function divides each ProductID partition into four quartiles based on Quantity.LEAD() and LAG() functions allow you to access subsequent and preceding rows' data without the need for self-joins.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
LAG(Quantity, 1) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity,
Quantity - LAG(Quantity, 1) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS QuantityChange
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | PreviousQuantity | QuantityChange |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | NULL | NULL |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 10 | 10 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 20 | -5 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | NULL | NULL |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 5 | 5 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 10 | -2 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | NULL | NULL |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 7 | -2 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | NULL | NULL |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 12 | -2 |
LAG(Quantity, 1) retrieves the quantity from the previous sale within the same ProductID partition.LEAD() can similarly retrieve data from subsequent rows.Aggregate functions like SUM(), AVG(), and COUNT() can also be used as window functions to perform calculations across a window of rows.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
Quantity * Price AS SaleAmount,
SUM(Quantity * Price) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | SaleAmount | RunningTotal |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 150.00 | 150.00 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 300.00 | 450.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 225.00 | 675.00 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 125.00 | 125.00 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 250.00 | 375.00 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 200.00 | 575.00 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | 210.00 | 210.00 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 150.00 | 360.00 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | 240.00 | 240.00 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 200.00 | 440.00 |
SUM(Quantity * Price) OVER (...) calculates a running total of sales amounts within each ProductID partition ordered by SaleDate.ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW defines the range from the first row to the current row for cumulative calculations.Window functions typically handle NULL values based on the specific function's behavior. For example, SUM() ignores NULL values, while ROW_NUMBER() assigns a unique number regardless of NULLs.
Suppose we have an additional sale with a NULL quantity.
Updated Sales Table
| SaleID | ProductID | SaleDate | Quantity | Price |
| 11 | 101 | 2024-01-25 | NULL | 15.00 |
Query: Assigning Row Numbers Including NULL Quantities
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | RowNum |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 1 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 2 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 3 |
| 11 | 101 | 2024-01-25 | NULL | 15.00 | 4 |
| ... | ... | ... | ... | ... | ... |
ROW_NUMBER() function assigns a row number to the sale with NULL quantity without any issues.NULL values differently, depending on their logic.
ROW_NUMBER() OVER (...) AS RowNumber
PARTITION BY and ORDER BY: Leverage PARTITION BY to segment data and ORDER BY to define the sequence within each partition.
RANK() OVER (PARTITION BY Category ORDER BY SaleAmount DESC) AS SaleRank
ROWS BETWEEN to define dynamic ranges for calculations.
AVG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS MovingAverage
Avoid Overusing Window Functions: While powerful, excessive use of window functions can lead to complex and less performant queries. Use them judiciously.
Understand Performance Implications: Window functions can impact query performance, especially on large datasets. Ensure proper indexing and consider query optimization techniques.
Combine with Other SQL Features: Window functions can be combined with CTEs (Common Table Expressions), subqueries, and other SQL features for more complex analyses.
WITH RankedSales AS (
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS SaleRank
FROM
Sales
)
SELECT
rs.SaleID,
rs.ProductID,
rs.SaleDate,
rs.Quantity,
rs.Price
FROM
RankedSales rs
WHERE
rs.SaleRank = 1;
While both window functions and aggregate functions perform calculations over sets of rows, they differ in key ways:
Window Functions: Retain individual row identities while providing additional calculated data.
Usage with GROUP BY:
GROUP BY to define grouping.OVER clause to define partitions and ordering without collapsing rows.Aggregate Function Example: Total Sales per Product
SELECT
ProductID,
SUM(Quantity * Price) AS TotalSales
FROM
Sales
GROUP BY
ProductID;
Window Function Example: Total Sales per Product Alongside Each Sale
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
SUM(Quantity * Price) OVER (PARTITION BY ProductID) AS TotalSalesPerProduct
FROM
Sales
ORDER BY
ProductID,
SaleDate;
ProductID.You can use multiple window functions within a single query to perform various analyses simultaneously.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum,
RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS QuantityRank,
SUM(Quantity * Price) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | RowNum | QuantityRank | RunningTotal |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 1 | 2 | 150.00 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 2 | 1 | 450.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 3 | 3 | 675.00 |
| 11 | 101 | 2024-01-25 | NULL | 15.00 | 4 | 4 | 675.00 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 1 | 3 | 125.00 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 2 | 1 | 375.00 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 3 | 2 | 575.00 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | 1 | 1 | 210.00 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 2 | 2 | 360.00 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | 1 | 1 | 240.00 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 2 | 2 | 440.00 |
ROW_NUMBER() assigns a unique row number within each ProductID partition.RANK() assigns ranks based on Quantity within each ProductID.SUM() OVER (...) calculates a running total of sales amounts per ProductID.Window frames define the subset of rows within the partition to be used for calculations in window functions. They are specified using the ROWS BETWEEN or RANGE BETWEEN clauses.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
AVG(Quantity) OVER (
PARTITION BY ProductID
ORDER BY SaleDate
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS MovingAverage
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | MovingAverage |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | 10.00 |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 15.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 15.00 |
| 11 | 101 | 2024-01-25 | NULL | 15.00 | 12.50 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | 5.00 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 7.50 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 7.67 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | 7.00 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 6.00 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | 12.00 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 11.00 |
AVG(Quantity) function calculates the average quantity over the current row and the two preceding rows within each ProductID partition.Window functions can be combined with Common Table Expressions (CTEs), subqueries, and other SQL constructs to perform complex data transformations and analyses.
WITH RankedSales AS (
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS SaleRank
FROM
Sales
)
SELECT
rs.SaleID,
rs.ProductID,
rs.SaleDate,
rs.Quantity,
rs.Price
FROM
RankedSales rs
WHERE
rs.SaleRank <= 2
ORDER BY
rs.ProductID,
rs.SaleRank;
Result
| SaleID | ProductID | SaleDate | Quantity | Price |
| 3 | 101 | 2024-01-10 | 20 | 15.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 |
| 5 | 102 | 2024-01-15 | 10 | 25.00 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 |
| 7 | 103 | 2024-01-22 | 5 | 30.00 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 |
| 10 | 104 | 2024-01-30 | 10 | 20.00 |
RankedSales assigns a rank to each sale based on Quantity within each ProductID.Before window functions were widely supported, similar analyses often required complex self-joins. Window functions simplify these operations, making queries more readable and efficient.
Using Window Functions
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Using Self-Joins
SELECT
s1.SaleID,
s1.ProductID,
s1.SaleDate,
s1.Quantity,
s1.Price,
s2.Quantity AS PreviousQuantity
FROM
Sales s1
LEFT JOIN
Sales s2
ON s1.ProductID = s2.ProductID
AND s1.SaleDate > s2.SaleDate
AND NOT EXISTS (
SELECT 1
FROM Sales s3
WHERE s3.ProductID = s1.ProductID
AND s3.SaleDate > s2.SaleDate
AND s3.SaleDate < s1.SaleDate
)
ORDER BY
s1.ProductID,
s1.SaleDate;
SUM(Quantity) OVER (
PARTITION BY ProductID
ORDER BY SaleDate
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
) AS SumQuantity
PARTITION BY and ORDER BY for more granular control.
ROW_NUMBER() OVER (
PARTITION BY Category, SubCategory
ORDER BY SaleDate DESC
) AS RowNum
RANGE Instead of ROWS: Define frames based on logical ranges rather than physical row counts.
AVG(Price) OVER (
PARTITION BY ProductID
ORDER BY SaleDate
RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW
) AS WeeklyAveragePrice
Suppose you want to analyze sales trends by calculating the percentage change in quantity compared to the previous sale for each product.
SELECT
SaleID,
ProductID,
SaleDate,
Quantity,
Price,
LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity,
CASE
WHEN LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) IS NULL THEN NULL
ELSE ((Quantity - LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate)) * 100.0) / LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate)
END AS PercentageChange
FROM
Sales
ORDER BY
ProductID,
SaleDate;
Result
| SaleID | ProductID | SaleDate | Quantity | Price | PreviousQuantity | PercentageChange |
| 1 | 101 | 2024-01-05 | 10 | 15.00 | NULL | NULL |
| 3 | 101 | 2024-01-10 | 20 | 15.00 | 10 | 100.00 |
| 6 | 101 | 2024-01-20 | 15 | 15.00 | 20 | -25.00 |
| 11 | 101 | 2024-01-25 | NULL | 15.00 | 15 | -100.00 |
| 2 | 102 | 2024-01-07 | 5 | 25.00 | NULL | NULL |
| 5 | 102 | 2024-01-15 | 10 | 25.00 | 5 | 100.00 |
| 9 | 102 | 2024-01-28 | 8 | 25.00 | 10 | -20.00 |
| 4 | 103 | 2024-01-12 | 7 | 30.00 | NULL | NULL |
| 7 | 103 | 2024-01-22 | 5 | 30.00 | 7 | -28.57 |
| 8 | 104 | 2024-01-25 | 12 | 20.00 | NULL | NULL |
| 10 | 104 | 2024-01-30 | 10 | 20.00 | 12 | -16.67 |
Quantity compared to the previous sale for each ProductID.