I had someone reach out about generate_series() recently, saying they hadn’t realized this was a new feature in SQL Server 2022. They were wondering if it was better than using a tally table.

I didn’t want to do an exhaustive test, but I thought I’d take a minute and try a couple simple things just to see.

A First Test

The first thing was to just generate a million numbers. Rather than just get the numbers. I decided to use a quick DATEADD() to create a list of calendar dates. Here’s the code:

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Create a Tally table with 1 million numbers
WITH Tally (n)
AS ( SELECT TOP (1000000)
             ROW_NUMBER () OVER (ORDER BY
                                   (SELECT NULL)) AS Number
      FROM
        master.dbo.spt_values a
        CROSS JOIN master.dbo.spt_values b)
SELECT DATEADD(DAY, n, GETDATE())
  FROM tally
SELECT DATEADD( DAY, value, GETDATE()) FROM GENERATE_SERIES(1, 1000000, 1)

Since this does read from tables, I ran it twice. The first time, the tally table took 243ms, so I re-ran it and saw this drop to 172ms. The results were consistent for Generate_series, which was 110ms.

A Second Test

I grabbed Jeff Moden’s code for random numbers and adjusted a second query to use GENERATE_SERIES(). The code is below.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
--===== Declare some obviously named variables
DECLARE @NumberOfRows INT,
@StartValue   INT,
@EndValue     INT,
@Range        INT
;
--===== Preset the variables to known values
SELECT @NumberOfRows = 1000000,
@StartValue   = 400,
@EndValue     = 500,
@Range        = @EndValue - @StartValue + 1
;
--===== Conditionally drop the test table to make reruns easier in SSMS
IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL
DROP TABLE #SomeTestTable
;
--===== Create the test table with "random constrained" integers and floats
-- within the parameters identified in the variables above.
SELECT TOP (@NumberOfRows)
SomeRandomInteger =  ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SomeRandomFloat   = RAND(CHECKSUM(NEWID())) * @Range + @StartValue
INTO #SomeTestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
SELECT TOP (@NumberOfRows)
SomeRandomInteger =  ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SomeRandomFloat   = RAND(CHECKSUM(NEWID())) * @Range + @StartValue
INTO #SomeTestTable2
FROM GENERATE_SERIES(1, @NumberOfRows, 1)

When I ran this, I see these results:

2025-02_0344

Execution times are close. Slightly faster with GENERATE_SERIES(), but fairly consistent across runs. In running this 10 times, there were 3 runs where the tally table was faster, and once just under 300ms. A few times the time was the same, but always within 15-16ms. Not sure that means much.

This isn’t a really exhaustive test, and don’t take this as a recommendation either way for your code. Test how they both work in your system, and certainly think about the impact of storing a tally table vs. generating one on the fly vs the GENERATE_SERIES().

However, it seems that GENERATE_SERIES() is worth looking at if you are on SQL Server 2022 or later.

Share.
Leave A Reply