SQL Server ghost records – in a nutshell
I was completely unaware of the SQL Server “ghost records” until recently I’ve started using a table in SQL Server database to temporarily store and then delete generated PDF documents as part of a nightly job. Because of a large amount of data being processed (inserted and then deleted) few days later our DBA realized that the database data files started growing in a rhythm of 10GB per day without any apparent reasons.
What is a Ghost Record
“Ghost Records” are records that have been logically but not physically deleted from the leaf level of an index. Such a delete operation never physically removes records from pages but it only marks them as having been deleted, or ghosted!
Ghost Records have been introduced as a performance optimization that makes the delete operations quicker. Rolling back delete operations are quicker because the only things that needs to happen is to un-mark the records as being deleted/ghosted, instead of having to reinsert the deleted records.
“imagine you have a unique index on an integer and the index contains the values 1, 30, and 100. If you delete 30, SQL Server will need to lock (and prevent inserts into) the entire range between 1 and 100. With ghosted records, the 30 is still visible to be used as an endpoint of a key-range lock so that during the delete transaction, SQL Server can allow inserts for any value other than 30 to proceed.”
Ghost Records removal
There are several ways of cleaning out the ghost records, among which :
- If a new record that has the same key value as the deleted record is re-inserted in a table
- The Ghost cleanup task (scheduled to run once every 5 seconds)
- If the page needs to be split, the ghost records will be handled
In my case, because the database in question is SQL Server 2005, the most effective way of cleaning up the database was:
- Drop and recreate the table (very ugly!)
- Shrink the database
- Rebuild indexes
- Recompile all of the stored procedures (for quicker response time after shrinking)
The recommended way to do this is to rebuild indexes on table that has problems, but for unknown reason this didn’t work properly for me.
How to find out about ghost records
SELECT ghost_record_count , version_ghost_record_count , * FROM sys.dm_db_index_physical_stats(DB_ID('DATABASE_NAME_HERE'), OBJECT_ID('TABLE_NAME_HERE'), NULL, NULL, 'DETAILED')
version_ghost_record_count will indicate a number of ghost records associated with the given database table.
The topic is quite deep and unfortunately not so well documented, so I collected some useful urls that could be of benefit posted below.
I think that it worth trying and checking if your database is affected by this, especially if you see the database growing for unknown reason.