I had someone reach out about generate_series() recently, saying they hadn’t realized this was a new feature in SQL Server 2022. They were wondering if it was better than using a tally table.
I didn’t want to do an exhaustive test, but I thought I’d take a minute and try a couple simple things just to see.
A First Test
The first thing was to just generate a million numbers. Rather than just get the numbers. I decided to use a quick DATEADD() to create a list of calendar dates. Here’s the code:
SET STATISTICS IO ON; SET STATISTICS TIME ON; -- Create a Tally table with 1 million numbers WITH Tally (n) AS ( SELECT TOP (1000000) ROW_NUMBER () OVER (ORDER BY (SELECT NULL)) AS Number FROM master.dbo.spt_values a CROSS JOIN master.dbo.spt_values b) SELECT DATEADD(DAY, n, GETDATE()) FROM tally SELECT DATEADD( DAY, value, GETDATE()) FROM GENERATE_SERIES(1, 1000000, 1)
Since this does read from tables, I ran it twice. The first time, the tally table took 243ms, so I re-ran it and saw this drop to 172ms. The results were consistent for Generate_series, which was 110ms.
A Second Test
I grabbed Jeff Moden’s code for random numbers and adjusted a second query to use GENERATE_SERIES(). The code is below.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
--===== Declare some obviously named variables
DECLARE @NumberOfRows INT,
@StartValue INT,
@EndValue INT,
@Range INT
;
--===== Preset the variables to known values
SELECT @NumberOfRows = 1000000,
@StartValue = 400,
@EndValue = 500,
@Range = @EndValue - @StartValue + 1
;
--===== Conditionally drop the test table to make reruns easier in SSMS
IF OBJECT_ID('tempdb..#SomeTestTable','U') IS NOT NULL
DROP TABLE #SomeTestTable
;
--===== Create the test table with "random constrained" integers and floats
-- within the parameters identified in the variables above.
SELECT TOP (@NumberOfRows)
SomeRandomInteger = ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SomeRandomFloat = RAND(CHECKSUM(NEWID())) * @Range + @StartValue
INTO #SomeTestTable
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
SELECT TOP (@NumberOfRows)
SomeRandomInteger = ABS(CHECKSUM(NEWID())) % @Range + @StartValue,
SomeRandomFloat = RAND(CHECKSUM(NEWID())) * @Range + @StartValue
INTO #SomeTestTable2
FROM GENERATE_SERIES(1, @NumberOfRows, 1)
When I ran this, I see these results:
Execution times are close. Slightly faster with GENERATE_SERIES(), but fairly consistent across runs. In running this 10 times, there were 3 runs where the tally table was faster, and once just under 300ms. A few times the time was the same, but always within 15-16ms. Not sure that means much.
This isn’t a really exhaustive test, and don’t take this as a recommendation either way for your code. Test how they both work in your system, and certainly think about the impact of storing a tally table vs. generating one on the fly vs the GENERATE_SERIES().
However, it seems that GENERATE_SERIES() is worth looking at if you are on SQL Server 2022 or later.
