Something

by Jack

Chunking a SQL Server insert statement dynamically

I needed a dynamic insert to update data, and recently needed to use that code as a base for breaking a large insert down into smaller chunks.

It took quite a bit of effort, and credit is due for some of the code re-used here.

This has a fairly specific use, and in general you should avoid running it against a live environment; I would only use it on a system that is down for maintenance.

It creates a stored procedure that accepts a To table, a From table, and a chunking amount. The From table must have a primary key defined, and every column in the From table must also exist in the To table. Information about each chunk is logged to the table track_large_loads. To create the logging table, the following SQL is used:

IF (SELECT 1 FROM sysobjects WHERE sysobjects.name = 'track_large_loads') IS NULL
BEGIN
  CREATE TABLE track_large_loads (
    table_name NVARCHAR(max),
    start_time DATETIME,
    end_time DATETIME,
    row_count INT
  )
END
GO
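For illustration, a From/To table pair that satisfies these requirements might look like this (the column definitions are invented for this example; only the table names come from the sample call at the end of the post). The From table has a primary key, and every one of its columns also exists in the To table:

-- Hypothetical From table: must have a primary key
CREATE TABLE men_srl_v831n (
  id INT NOT NULL PRIMARY KEY,
  serial_no NVARCHAR(50),
  created DATETIME
)
GO
-- Hypothetical To table: contains (at least) every column of the From table
CREATE TABLE men_srl (
  id INT NOT NULL,
  serial_no NVARCHAR(50),
  created DATETIME
)
GO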

The stored procedure is:

-- drop old version
IF (SELECT 1 FROM sysobjects WHERE sysobjects.name = 'sp_large_load' ) IS NOT NULL
BEGIN
DROP PROCEDURE sp_large_load
END
GO
--Create a procedure to load records in blocks
CREATE PROCEDURE sp_large_load
	@FromTable VARCHAR(100),
	@ToTable VARCHAR(100),
	@Rows_Per_Iteration BIGINT
AS
BEGIN
  DECLARE @start_time AS DATETIME
  DECLARE @maximum_row AS BIGINT
  DECLARE @starting_row AS BIGINT
  DECLARE @stop_row AS BIGINT
  DECLARE @rows_added AS INT
  DECLARE @message AS VARCHAR(400)
  DECLARE @table_columns NVARCHAR(max) 
  DECLARE @insert_command NVARCHAR(max)
  DECLARE @insert_variables NVARCHAR(max)
  DECLARE @total_row_command NVARCHAR(max)
  DECLARE @table_columns_command NVARCHAR(max)
  DECLARE @primary_keys NVARCHAR(MAX) 
  DECLARE @primary_keys_command NVARCHAR(MAX)
  -- Basic Info Out 
  PRINT 'From Table: ' + @FromTable
  PRINT 'To Table: ' + @ToTable
  --Determine how many rows need to be processed in total
  SELECT @total_row_command = N'SELECT @maximum_row = Count(*) FROM ' + @FromTable
  EXECUTE sp_executesql @total_row_command, N'@maximum_row BIGINT OUTPUT', @maximum_row = @maximum_row OUTPUT 
  
  --If this process had already run, it may have already added some rows.
  --The commented-out lookup below would resume from where the last run stopped;
  --for a fresh load we simply start at row 1.
  --SET @starting_row = IsNull((SELECT Sum(row_count) + 1
  --FROM track_large_loads),1)
  SET @starting_row = 1
  
  -- build insert where clause
  SELECT @table_columns_command =  N'SELECT @table_columns = COALESCE(@table_columns + '', '', '''') '
	+ N' + column_name '
		+ N'FROM information_schema.columns '
		+ N'WHERE TABLE_NAME = @FromTable '
		+ N'ORDER BY ORDINAL_POSITION'
  EXECUTE sp_executesql @table_columns_command,  N'@table_columns NVARCHAR(max) OUTPUT, @FromTable VARCHAR(100)', @table_columns = @table_columns OUTPUT, @FromTable = @FromTable
  PRINT 'table_columns: ' + @table_columns
  -- build primary key list for using row-count
  -- this info needs to come from the old table
  -- a Pk should exist to increase performance
  SELECT 
	@primary_keys_command = N' SELECT @primary_keys = COALESCE(@primary_keys + '', '', '''') '
	+ N'+ KU.COLUMN_NAME '
		+ N'FROM '
				+ N'INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS TC '
				+ N'INNER JOIN '
				+ N'INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS KU '
				+ N'ON TC.CONSTRAINT_TYPE = ''PRIMARY KEY'' AND '
				+ N'TC.CONSTRAINT_NAME = KU.CONSTRAINT_NAME '
				+ N'WHERE KU.TABLE_NAME = @FromTable '
				+ N'ORDER BY KU.TABLE_NAME, KU.ORDINAL_POSITION'
  EXECUTE sp_executesql @primary_keys_command,  N'@primary_keys NVARCHAR(max) OUTPUT, @FromTable VARCHAR(100)', @primary_keys = @primary_keys OUTPUT, @FromTable = @FromTable
  PRINT 'primary_keys: ' + @primary_keys
  --Continue looping until all  records have been processed
  SET NOCOUNT OFF
  WHILE @starting_row <= @maximum_row
  BEGIN
	SET @message = 'Processing next '+
	  Cast(@Rows_Per_Iteration AS VARCHAR(100))+
	  ' records starting at record '+
	  Cast(@starting_row AS VARCHAR(100))+' (of '+
	  Cast(@maximum_row AS VARCHAR(100))+') from table ' 
	  + @FromTable + ' to table  ' + @ToTable
	PRINT @message
   
	-- build needed variables
	SET @stop_row = @starting_row + @Rows_Per_Iteration - 1
	SET @start_time = GetDate()
	
	--Insert the next set of rows 
	-- and log it to the log table in the same transaction
	BEGIN TRANSACTION
	--build statement 
		SELECT @insert_variables = '@starting_row BIGINT, @stop_row BIGINT'
		SELECT @insert_command = N'insert into ' + @ToTable 
		  + N' ( ' + @table_columns  + N') SELECT ' + @table_columns + N' FROM ' 
	  + N'( SELECT *, '
		   + N' Row_Number() OVER (ORDER BY ' + @primary_keys + N') AS rownumber'
		   + N' FROM ' + @FromTable + N') as old_table'
	  + N' WHERE rownumber BETWEEN @starting_row  AND @stop_row'
-- PRINT 'insert command: ' + @insert_command

	-- run statement
		EXECUTE sp_executesql @insert_command, @insert_variables, @starting_row = @starting_row, @stop_row = @stop_row
	--Log the records that have been added
	SELECT @rows_added = @@RowCount
	INSERT track_large_loads (table_name,start_time,end_time,row_count)
	VALUES (@ToTable,@start_time, GetDate(), @rows_added)
	COMMIT TRANSACTION
	SET @starting_row = @starting_row + @rows_added
  
  END --end while statement
END --end stored proc
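To make the dynamic SQL concrete, this is roughly what @insert_command ends up containing for the hypothetical tables above (reconstructed by hand from the string-building code, with whitespace tidied; @starting_row and @stop_row stay as parameters and are supplied through sp_executesql):

insert into men_srl (id, serial_no, created)
SELECT id, serial_no, created
FROM (SELECT *, Row_Number() OVER (ORDER BY id) AS rownumber
      FROM men_srl_v831n) as old_table
WHERE rownumber BETWEEN @starting_row AND @stop_row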

You can download it.

You can call the stored procedure with a command of the following form:

exec sp_large_load @FromTable = 'men_srl_v831n', @ToTable = 'men_srl', @Rows_Per_Iteration = 100
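Because each chunk is logged, you can watch the load from another session. A couple of example queries against track_large_loads (the table_name filter assumes the hypothetical men_srl load above):

-- One row per chunk, in the order they ran
SELECT table_name, start_time, end_time, row_count
FROM track_large_loads
ORDER BY start_time

-- Total rows copied so far for one target table
SELECT Sum(row_count) AS rows_copied
FROM track_large_loads
WHERE table_name = 'men_srl'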