Something

by Jack

Randomising SQL Server Rows in a Column

Recently I had a requirement to create some stored procedures to Randomise data in our Microsoft SQL Server databases. Having a quick google around it seems most people just want to scramble things in their current text fields. Not what I want to do! Just swap rows around.

I finally came across this excellent article here, I modified some of them slightly as I wanted to run them as stored procedures. The idea being that the user specifies a profile in the application of various elements they want to randomise then it runs the stored procedure for the required elements.

Firstly a procedure to swap rows around:

CREATE PROCEDURE swap_row_data
@table_name_var NVARCHAR(30),
@col_name_ver NVARCHAR(40)
AS

DECLARE @update_command NVARCHAR(max)
SELECT @update_command = 'with generate_row_ids as (
select row_number() over (order by ' + @col_name_ver +') as orig_rownum,
row_number() over (order by newid()) as new_rownum,
*
from ' + @table_name_var +' )
update t1
set ' + @col_name_ver + ' = t2.' + @col_name_ver +
' from generate_row_ids t1
join generate_row_ids t2
on t1.orig_rownum = t2.new_rownum'

EXEC(@update_command)

The two parameters being the table name and the column name.

The second one randomised dates generating a value between the lowest and highest currently in the table:

CREATE PROCEDURE randomise_dates
@table_name_var NVARCHAR(30),
@col_name_ver NVARCHAR(40)
AS
DECLARE @update_command NVARCHAR(max)
SELECT @update_command = 'update ' + @table_name_var +
' set ' + @col_name_ver +' = (
select
dateadd(day,
abs(checksum(newid())) %
datediff(day, min(' + @col_name_ver + '), max(' + @col_name_ver + ')) + 1,
min(' + @col_name_ver +')
)
from ' + @table_name_var
+ ')'

EXEC (@update_command)

And the third randomises the numeric value in a field between 0.1 and 10, with a median of 1.

CREATE PROCEDURE randomise_numeric
@table_name_var NVARCHAR(30),
@col_name_ver NVARCHAR(40)
AS
DECLARE @update_command NVARCHAR(max)
SELECT @update_command = 'update ' + @table_name_var +
' set ' + @col_name_ver + ' = ' + @col_name_ver + ' * exp(log(10) * (checksum(newid()) % 1000 / 1000.))'

EXEC (@update_command)