Returning a result set in random order

At times you might need to return the SQL Server data to the client in a random order. Here's how.

This Content Component encountered an error

At times, you might need to return the SQL Server data to the client in a random order. For instance, in educational environment you might want to randomize the order of questions and answers on a test, so that students can't easily cheat by sharing the answer IDs. Similarly, if you want to return a different image each time someone visits your website you can run a "SELECT TOP 1" query on the list of your image files and order them...

randomly.

Unfortunately SQL Server doesn't provide built-in functionality for random sorting of the result set. You can use the ORDER BY clause to sort the results by different columns returned in ascending or descending order. However, that doesn't really give you truly random results. The good news is that a bit of coding can help you get around this limitation. Here is how you can do that.

SQL Server does have a built-in function that returns a random number, which can be executed as follows:

SELECT RAND()

Result: 0.66278421915814423

You can specify a seed for the RAND() function within the parentheses, but executing the function this way returns the same number every time for the same seed. For instance, the random of 2 will always be 0.7136106261841817.

To demonstrate returning a random result set, I'll use the 'authors' table in the 'pubs' database and return the list of author names randomly. I first get a list of all authors into a temporary table, as follows:

CREATE TABLE #temp (
 au_lname VARCHAR(30) NOT NULL, 
 au_fname VARCHAR(30) NOT NULL, 
 sort_column INT NULL)

INSERT #temp (
 au_lname, 
 au_fname
  ) 

SELECT  
  DISTINCT  
 au_lname, 
 au_fname  
FROM  
 authors 

If you check the list of authors in the temporary table at this point it is ordered alphabetically. That is because I used the DISTINCT keyword, which automatically orders the results of a SELECT. There are only two ways I can sort on the name, so I need something else--I'll use the sort_column column to order the result set.

Next, I will add an identity column to the table, to get a simple enumeration. As you might be aware the ALTER TABLE syntax supports adding an identity column to the table. The syntax expects the seed and increment values for the IDENTITY property, defaulting both values to 1. The trick is to start the identity column with a different seed (or a different increment) for each execution of the script. So I use the RAND function (without specifying the randomization seed) to get the random identity seed:

DECLARE @i INT,
 @sql VARCHAR(300)

-- get a random seed for the 
-- IDENTITY column
SET @i = ROUND(RAND() * 10, 0)

SELECT @sql = 'ALTER TABLE #temp ADD dummy_identity INT IDENTITY(' 
SELECT @sql = @SQL + CAST(@i AS VARCHAR) + ', 1)'
EXEC(@sql)

Ordering my result set on the identity column will not provide the functionality I need, even though the identity seed will be different for each run the result set will always be sorted the same way. Therefore, I need to modify identity values randomly. SQL Server won't let me explicitly update the identity column values; therefore, I'll copy dummy_identity values into the existing sort_column column:

SELECT @sql = ' '

SELECT @sql = @sql + ' UPDATE #temp SET sort_column = dummy_identity '
EXEC(@sql)

Notice that I had to reinitialize @sql variable. Since I'm using the EXEC syntax the scope of execution of the dynamic SQL statement is different from the rest of the batch. By the way, sp_executesql behaves the same way. Had I tried executing the script without reinitializing the variable I would have received a runtime error informing me that dummy_identity column does not exist.

Next, I'll update the sort_column column randomly by running through a cursor on a temporary table:

DECLARE @sort_column INT
DECLARE temp_cursor CURSOR FOR
SELECT sort_column FROM #temp
OPEN temp_cursor
FETCH NEXT FROM temp_cursor INTO @sort_column
WHILE @@FETCH_STATUS = 0
BEGIN
 UPDATE #temp SET sort_column = sort_column * ROUND(RAND() * 25, 0)
 WHERE sort_column = @sort_column
 FETCH NEXT FROM temp_cursor INTO @sort_column
END
CLOSE temp_cursor
DEALLOCATE temp_cursor

Finally, I'll read values from the temp table and sort the results based on sort_id:

SELECT  au_lname, 
  au_fname
 FROM #temp ORDER BY sort_column
-- clean up:
DROP TABLE #temp

Each execution of this script will return the list of author names in a random order. Try it out!

Reader Feedback

Les F. writes: Why go to the effort of creating an identity column and then copying it to the sort order column? Why not just skip this step and update the sort order column as in the following:

DECLARE @sort_column INT 
DECLARE temp_cursor CURSOR FOR SELECT sort_column FROM #temp 
OPEN temp_cursor
 FETCH NEXT FROM temp_cursor INTO @sort_column 
WHILE @@FETCH_STATUS = 0 
BEGIN 
UPDATE #temp SET sort_column =  ROUND(RAND() * 25, 0) 
 WHERE sort_column = @sort_column 

-- notice how I don't multiply the sort_column with the random value, 
-- I just set it!

FETCH NEXT FROM temp_cursor INTO @sort_column 
END
 CLOSE temp_cursor
 DEALLOCATE temp_cursor

Baya responds: If you skip the step of adding the identity column and copying its values into another (non-identity) column then @sort_column will always be null and the cursor won't update any rows. Therefore you will always get the result set in the same order. If you'd like to try it just comment out the portion that adds the identity column and run your cursor instead of the one provided in the article. Every execution will return the list of authors in the same order they appear in Authors table.

Ankura. writes: I think we still do not need to create an identity column and run an 'EXPLICIT CURSOR'. Replace the script for addition of identity column as well as cursor block. Please note that some of the values in sort_column are duplicated, which can be avoided by increasing the multiplying factor to a LARGE number:

set rowcount 1 while
exists(select 1 from #temp where sort_column is NULL)
begin update #temp
set sort_column = ROUND(RAND() * 25, 0)
where sort_column is NULL
if @@rowcount = 0 break
end
set rowcount 0

Baya responds: I agree. Both solutions yield similar results. If you're against using cursors and/or identity columns, then the second solution is your friend. I don't think the performance would be better/worse either way since you will still have to loop through the records whether using a CURSOR or WHILE.

For More Information

  • What do you think about this tip? E-mail the Editor at tdichiara@techtarget.com with your feedback.
  • The Best SQL Server Web Links: tips, tutorials, scripts, and more.
  • Have an SQL Server tip to offer your fellow DBA's and developers? The best tips submitted will receive a cool prize--submit your tip today!
  • Ask your technical SQL Server questions--or help out your peers by answering them--in our live discussion forums.
  • Check out our Ask the Experts feature: Our SQL, Database Design, Oracle, SQL Server, DB2, metadata, and data warehousing gurus are waiting to answer your toughest questions.

This was first published in May 2002

Dig deeper on SQL Server and .NET development

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchCloudComputing

SearchSoftwareQuality

SearchSOA

TheServerSide

SearchCloudApplications

Close