Deleting data from a very large SQL table

Created by Jeremy Burgess, Modified on Mon, 10 Jul, 2023 at 4:47 PM by Vasanth Paratharajan

Symptoms

When it's necessary to clear down rows from a large table, for instance when a server is running out of space, its possible to run out of space while deleting data.

Cause

Most operations on SQL server will cause a transaction to be logged, even when the database is in SIMPLE recovery mode (i.e. does not retain transaction logs). SIMPLE recovery just means that the transaction log is recycled on every operation. A large operation will cause the [temporary] log to grow based on the size of the data change.

Resolution

Remember that the most efficient way to remove all data from a table is to TRUNCATE but often it's only some records that need to be deleted (e.g. older than 'n' days). One strategy to consider is to copy the data you want to retain into a #temp table, truncate the original and then re-insert the old data.


Alternatively, break a large delete/update operation into more maneagable chunks. This also benefits other users/processes accessing the server because a series of small requests will allow other requests to happen in between. Each of the 'transactions' are very small so do not fill the transaction log disk.


A sample script for deleting 'n' rows at a time is below. Note that it has additional progress reporting which makes operation slower but gives better observability for the result.

DECLARE @n INT = 10000;
DECLARE @r INT;
DECLARE @totalRows INT = 0;
DECLARE @rowsPerMin FLOAT = 0;
DECLARE @elapsed FLOAT = 0;
DECLARE @totalElapsed FLOAT = 0;
DECLARE @msg NVARCHAR(100);
DECLARE @opStart DATETIME = GETDATE();
DECLARE @loopStart DATETIME;
DECLARE @dayThreshold INT = 14;

SET @r = 1;

WHILE @r > 0
BEGIN
	--BEGIN TRANSACTION;
	SET @loopStart = GETDATE();

	DELETE TOP (@n)
	FROM [mailroom.api].[dbo].[Logs]
	WHERE [Level] LIKE 'Information'
		AND [timestamp] < GETDATE() - @dayThreshold;

	SET @r = @@ROWCOUNT;
	SET @totalRows = @totalRows + @r;
	SET @elapsed = DATEDIFF(millisecond, @loopStart, GETDATE());
	SET @totalElapsed = DATEDIFF(second, @opStart, GETDATE());
	SET @msg = CONCAT (
			FORMAT(@r, 'N0')
			,' deleted in '
			,@elapsed
			,'ms @ '
			,CAST(@r / (@elapsed / 1000) AS INT)
			,' per sec ('
			,FORMAT(@totalRows, 'N0')
			,' in '
			,CAST(@totalElapsed / 60 AS DECIMAL(6, 1))
			,'mins, average '
			,IIF(@totalElapsed > 0, CAST(@totalRows / @totalElapsed AS INT), 0)
			,' per sec)'
			);

	RAISERROR (
			@msg
			,10
			,1
			)
	WITH NOWAIT;
		--COMMIT TRANSACTION;
END

With sample output:

100 deleted in 110ms @ 909 per sec (3700 in 0.2mins, average 308 per sec)
100 deleted in 830ms @ 120 per sec (3800 in 0.2mins, average 292 per sec)
100 deleted in 703ms @ 142 per sec (3900 in 0.2mins, average 300 per sec)
100 deleted in 500ms @ 200 per sec (4000 in 0.2mins, average 285 per sec)
100 deleted in 4584ms @ 21 per sec (4100 in 0.3mins, average 215 per sec)
100 deleted in 670ms @ 149 per sec (4200 in 0.3mins, average 221 per sec)
100 deleted in 173ms @ 578 per sec (4300 in 0.3mins, average 226 per sec)
100 deleted in 30ms @ 3333 per sec (4400 in 0.3mins, average 231 per sec)
100 deleted in 173ms @ 578 per sec (4500 in 0.3mins, average 225 per sec)
100 deleted in 30ms @ 3333 per sec (4600 in 0.3mins, average 230 per sec)

Was this article helpful?

That’s Great!

Thank you for your feedback

Sorry! We couldn't be helpful

Thank you for your feedback

Let us know how can we improve this article!

Select at least one of the reasons
CAPTCHA verification is required.

Feedback sent

We appreciate your effort and will try to fix the article