Last modified: March 30, 2024

This article is written in: 🇺🇸

Window Functions in SQL

Window functions in SQL are powerful tools that allow you to perform calculations across a set of table rows that are related to the current row. Unlike aggregate functions, window functions do not collapse rows into a single output row; instead, they retain the individual row identities while providing additional analytical capabilities. Window functions are essential for advanced data analysis, reporting, and generating insights that require comparisons or calculations across related rows.

After reading the material, you should be able to answer the following questions:

  1. What are window functions in SQL, and how do they differ from aggregate functions?
  2. What are some common window functions, and what are their typical use cases?
  3. How does the OVER clause define the window for window functions, and what components can it include?
  4. What are the advantages and limitations of using window functions compared to traditional methods like self-joins?
  5. What best practices should be followed when implementing window functions in SQL to ensure optimal performance and maintainability?

Common Window Functions

Here are some of the most commonly used window functions in SQL:

Setting Up Example Tables

Suppose we have two tables: Sales and Products.

Sales Table

SaleID ProductID SaleDate Quantity Price
1 101 2024-01-05 10 15.00
2 102 2024-01-07 5 25.00
3 101 2024-01-10 20 15.00
4 103 2024-01-12 7 30.00
5 102 2024-01-15 10 25.00
6 101 2024-01-20 15 15.00
7 103 2024-01-22 5 30.00
8 104 2024-01-25 12 20.00
9 102 2024-01-28 8 25.00
10 104 2024-01-30 10 20.00

Products Table

ProductID ProductName Category
101 Widget A Gadgets
102 Widget B Gadgets
103 Gizmo C Widgets
104 Gizmo D Widgets

Understanding the OVER Clause

Window functions use the OVER clause to define the window or the set of rows the function should operate on. The OVER clause can include:

ROW_NUMBER Function

The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition, starting at 1 for the first row in each partition.

Example: Assigning Row Numbers to Sales per Product

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price RowNum
1 101 2024-01-05 10 15.00 1
3 101 2024-01-10 20 15.00 2
6 101 2024-01-20 15 15.00 3
2 102 2024-01-07 5 25.00 1
5 102 2024-01-15 10 25.00 2
9 102 2024-01-28 8 25.00 3
4 103 2024-01-12 7 30.00 1
7 103 2024-01-22 5 30.00 2
8 104 2024-01-25 12 20.00 1
10 104 2024-01-30 10 20.00 2

RANK and DENSE_RANK Functions

Both RANK() and DENSE_RANK() assign a rank to each row within a partition. The difference lies in how they handle ties.

Example: Ranking Sales by Quantity per Product

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS Rank,
    DENSE_RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS DenseRank
FROM
    Sales
ORDER BY
    ProductID,
    Quantity DESC;

Result

SaleID ProductID SaleDate Quantity Price Rank DenseRank
101 101 2024-01-10 20 15.00 1 1
6 101 2024-01-20 15 15.00 2 2
1 101 2024-01-05 10 15.00 3 3
102 102 2024-01-15 10 25.00 1 1
9 102 2024-01-28 8 25.00 2 2
2 102 2024-01-07 5 25.00 3 3
103 103 2024-01-12 7 30.00 1 1
7 103 2024-01-22 5 30.00 2 2
104 104 2024-01-25 12 20.00 1 1
10 104 2024-01-30 10 20.00 2 2

NTILE Function

The NTILE() function distributes the rows in an ordered partition into a specified number of roughly equal groups.

Example: Dividing Sales into Quartiles per Product

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    NTILE(4) OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS Quartile
FROM
    Sales
ORDER BY
    ProductID,
    Quantity DESC;

Result

SaleID ProductID SaleDate Quantity Price Quartile
101 101 2024-01-10 20 15.00 1
6 101 2024-01-20 15 15.00 2
1 101 2024-01-05 10 15.00 3
102 102 2024-01-15 10 25.00 1
9 102 2024-01-28 8 25.00 2
2 102 2024-01-07 5 25.00 3
103 103 2024-01-12 7 30.00 1
7 103 2024-01-22 5 30.00 2
104 104 2024-01-25 12 20.00 1
10 104 2024-01-30 10 20.00 2

LEAD and LAG Functions

LEAD() and LAG() functions allow you to access subsequent and preceding rows' data without the need for self-joins.

Example: Comparing Current Sale with Previous Sale Quantity per Product

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    LAG(Quantity, 1) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity,
    Quantity - LAG(Quantity, 1) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS QuantityChange
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price PreviousQuantity QuantityChange
1 101 2024-01-05 10 15.00 NULL NULL
3 101 2024-01-10 20 15.00 10 10
6 101 2024-01-20 15 15.00 20 -5
2 102 2024-01-07 5 25.00 NULL NULL
5 102 2024-01-15 10 25.00 5 5
9 102 2024-01-28 8 25.00 10 -2
4 103 2024-01-12 7 30.00 NULL NULL
7 103 2024-01-22 5 30.00 7 -2
8 104 2024-01-25 12 20.00 NULL NULL
10 104 2024-01-30 10 20.00 12 -2

Aggregate Window Functions

Aggregate functions like SUM(), AVG(), and COUNT() can also be used as window functions to perform calculations across a window of rows.

Example: Calculating Running Total of Sales Amount per Product

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    Quantity * Price AS SaleAmount,
    SUM(Quantity * Price) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price SaleAmount RunningTotal
1 101 2024-01-05 10 15.00 150.00 150.00
3 101 2024-01-10 20 15.00 300.00 450.00
6 101 2024-01-20 15 15.00 225.00 675.00
2 102 2024-01-07 5 25.00 125.00 125.00
5 102 2024-01-15 10 25.00 250.00 375.00
9 102 2024-01-28 8 25.00 200.00 575.00
4 103 2024-01-12 7 30.00 210.00 210.00
7 103 2024-01-22 5 30.00 150.00 360.00
8 104 2024-01-25 12 20.00 240.00 240.00
10 104 2024-01-30 10 20.00 200.00 440.00

Handling NULL Values in Window Functions

Window functions typically handle NULL values based on the specific function's behavior. For example, SUM() ignores NULL values, while ROW_NUMBER() assigns a unique number regardless of NULLs.

Example: Assigning Row Numbers with NULL Quantities

Suppose we have an additional sale with a NULL quantity.

Updated Sales Table

SaleID ProductID SaleDate Quantity Price
11 101 2024-01-25 NULL 15.00

Query: Assigning Row Numbers Including NULL Quantities

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price RowNum
1 101 2024-01-05 10 15.00 1
3 101 2024-01-10 20 15.00 2
6 101 2024-01-20 15 15.00 3
11 101 2024-01-25 NULL 15.00 4
... ... ... ... ... ...

Practical Tips for Using Window Functions

ROW_NUMBER() OVER (...) AS RowNumber

RANK() OVER (PARTITION BY Category ORDER BY SaleAmount DESC) AS SaleRank

AVG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS MovingAverage

WITH RankedSales AS (
      SELECT
          SaleID,
          ProductID,
          SaleDate,
          Quantity,
          Price,
          RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS SaleRank
      FROM
          Sales
  )
  SELECT
      rs.SaleID,
      rs.ProductID,
      rs.SaleDate,
      rs.Quantity,
      rs.Price
  FROM
      RankedSales rs
  WHERE
      rs.SaleRank = 1;

Window Functions vs. Aggregate Functions

While both window functions and aggregate functions perform calculations over sets of rows, they differ in key ways:

Example: Comparing Aggregate and Window Functions

Aggregate Function Example: Total Sales per Product

SELECT
    ProductID,
    SUM(Quantity * Price) AS TotalSales
FROM
    Sales
GROUP BY
    ProductID;

Window Function Example: Total Sales per Product Alongside Each Sale

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    SUM(Quantity * Price) OVER (PARTITION BY ProductID) AS TotalSalesPerProduct
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Combining Multiple Window Functions

You can use multiple window functions within a single query to perform various analyses simultaneously.

Example: Sales Analysis with Multiple Window Functions

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SaleDate) AS RowNum,
    RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS QuantityRank,
    SUM(Quantity * Price) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price RowNum QuantityRank RunningTotal
1 101 2024-01-05 10 15.00 1 2 150.00
3 101 2024-01-10 20 15.00 2 1 450.00
6 101 2024-01-20 15 15.00 3 3 675.00
11 101 2024-01-25 NULL 15.00 4 4 675.00
2 102 2024-01-07 5 25.00 1 3 125.00
5 102 2024-01-15 10 25.00 2 1 375.00
9 102 2024-01-28 8 25.00 3 2 575.00
4 103 2024-01-12 7 30.00 1 1 210.00
7 103 2024-01-22 5 30.00 2 2 360.00
8 104 2024-01-25 12 20.00 1 1 240.00
10 104 2024-01-30 10 20.00 2 2 440.00

Window Frames

Window frames define the subset of rows within the partition to be used for calculations in window functions. They are specified using the ROWS BETWEEN or RANGE BETWEEN clauses.

Example: Calculating a Moving Average of Quantity

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    AVG(Quantity) OVER (
        PARTITION BY ProductID
        ORDER BY SaleDate
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS MovingAverage
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price MovingAverage
1 101 2024-01-05 10 15.00 10.00
3 101 2024-01-10 20 15.00 15.00
6 101 2024-01-20 15 15.00 15.00
11 101 2024-01-25 NULL 15.00 12.50
2 102 2024-01-07 5 25.00 5.00
5 102 2024-01-15 10 25.00 7.50
9 102 2024-01-28 8 25.00 7.67
4 103 2024-01-12 7 30.00 7.00
7 103 2024-01-22 5 30.00 6.00
8 104 2024-01-25 12 20.00 12.00
10 104 2024-01-30 10 20.00 11.00

Practical Use Cases for Window Functions

Combining Window Functions with Other SQL Features

Window functions can be combined with Common Table Expressions (CTEs), subqueries, and other SQL constructs to perform complex data transformations and analyses.

Example: Identifying Top 2 Sales per Product

WITH RankedSales AS (
    SELECT
        SaleID,
        ProductID,
        SaleDate,
        Quantity,
        Price,
        RANK() OVER (PARTITION BY ProductID ORDER BY Quantity DESC) AS SaleRank
    FROM
        Sales
)
SELECT
    rs.SaleID,
    rs.ProductID,
    rs.SaleDate,
    rs.Quantity,
    rs.Price
FROM
    RankedSales rs
WHERE
    rs.SaleRank <= 2
ORDER BY
    rs.ProductID,
    rs.SaleRank;

Result

SaleID ProductID SaleDate Quantity Price
3 101 2024-01-10 20 15.00
6 101 2024-01-20 15 15.00
5 102 2024-01-15 10 25.00
9 102 2024-01-28 8 25.00
4 103 2024-01-12 7 30.00
7 103 2024-01-22 5 30.00
8 104 2024-01-25 12 20.00
10 104 2024-01-30 10 20.00

Comparing Window Functions with Self-Joins

Before window functions were widely supported, similar analyses often required complex self-joins. Window functions simplify these operations, making queries more readable and efficient.

Example: Using Window Functions vs. Self-Joins to Compare Current and Previous Sales

Using Window Functions

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Using Self-Joins

SELECT
    s1.SaleID,
    s1.ProductID,
    s1.SaleDate,
    s1.Quantity,
    s1.Price,
    s2.Quantity AS PreviousQuantity
FROM
    Sales s1
LEFT JOIN
    Sales s2
    ON s1.ProductID = s2.ProductID
    AND s1.SaleDate > s2.SaleDate
    AND NOT EXISTS (
        SELECT 1
        FROM Sales s3
        WHERE s3.ProductID = s1.ProductID
        AND s3.SaleDate > s2.SaleDate
        AND s3.SaleDate < s1.SaleDate
    )
ORDER BY
    s1.ProductID,
    s1.SaleDate;

Limitations and Considerations

Advanced Window Function Features

SUM(Quantity) OVER (
      PARTITION BY ProductID
      ORDER BY SaleDate
      ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
  ) AS SumQuantity

ROW_NUMBER() OVER (
      PARTITION BY Category, SubCategory
      ORDER BY SaleDate DESC
  ) AS RowNum

AVG(Price) OVER (
      PARTITION BY ProductID
      ORDER BY SaleDate
      RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW
  ) AS WeeklyAveragePrice

Example Use Case: Sales Trend Analysis

Suppose you want to analyze sales trends by calculating the percentage change in quantity compared to the previous sale for each product.

SELECT
    SaleID,
    ProductID,
    SaleDate,
    Quantity,
    Price,
    LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) AS PreviousQuantity,
    CASE 
        WHEN LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate) IS NULL THEN NULL
        ELSE ((Quantity - LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate)) * 100.0) / LAG(Quantity) OVER (PARTITION BY ProductID ORDER BY SaleDate)
    END AS PercentageChange
FROM
    Sales
ORDER BY
    ProductID,
    SaleDate;

Result

SaleID ProductID SaleDate Quantity Price PreviousQuantity PercentageChange
1 101 2024-01-05 10 15.00 NULL NULL
3 101 2024-01-10 20 15.00 10 100.00
6 101 2024-01-20 15 15.00 20 -25.00
11 101 2024-01-25 NULL 15.00 15 -100.00
2 102 2024-01-07 5 25.00 NULL NULL
5 102 2024-01-15 10 25.00 5 100.00
9 102 2024-01-28 8 25.00 10 -20.00
4 103 2024-01-12 7 30.00 NULL NULL
7 103 2024-01-22 5 30.00 7 -28.57
8 104 2024-01-25 12 20.00 NULL NULL
10 104 2024-01-30 10 20.00 12 -16.67

Table of Contents

    Window Functions in SQL
    1. Common Window Functions
    2. Setting Up Example Tables
    3. Understanding the OVER Clause
    4. ROW_NUMBER Function
      1. Example: Assigning Row Numbers to Sales per Product
    5. RANK and DENSE_RANK Functions
      1. Example: Ranking Sales by Quantity per Product
    6. NTILE Function
      1. Example: Dividing Sales into Quartiles per Product
    7. LEAD and LAG Functions
      1. Example: Comparing Current Sale with Previous Sale Quantity per Product
    8. Aggregate Window Functions
      1. Example: Calculating Running Total of Sales Amount per Product
    9. Handling NULL Values in Window Functions
      1. Example: Assigning Row Numbers with NULL Quantities
    10. Practical Tips for Using Window Functions
    11. Window Functions vs. Aggregate Functions
      1. Example: Comparing Aggregate and Window Functions
    12. Combining Multiple Window Functions
      1. Example: Sales Analysis with Multiple Window Functions
    13. Window Frames
      1. Example: Calculating a Moving Average of Quantity
    14. Practical Use Cases for Window Functions
    15. Combining Window Functions with Other SQL Features
      1. Example: Identifying Top 2 Sales per Product
    16. Comparing Window Functions with Self-Joins
      1. Example: Using Window Functions vs. Self-Joins to Compare Current and Previous Sales
    17. Limitations and Considerations
    18. Advanced Window Function Features
    19. Example Use Case: Sales Trend Analysis