Managing your SQL Log files - Death By Patterns

Managing your SQL Log files

Part 2, covering how the transaction log is affected by the recovery model and how it affects data loss, is now out 

OK, everyone who has put some thought into the setup of their transaction log file, please raise your hand...

 

...I thought so.

 

The sad truth is that there are very few people who think through the log files for their production systems - often with dramatic results.  Each and every single SQL Server database has both a data file and a log file, both of which serve an important function.  The data file stores the actual data in the database whilst the log file stores all changes made to the data stored in the data file.  Why?  What is the purpose of storing (basically) the same information twice?

why do we need a log file?

Log files help satisfy the ACID requirements of RDBMS's.  Just about every command that gets issued against your database is stored in the log file (there are some exceptions).  Some clarification: Every command you issue stores the following in the log file:

  • The start time for the transaction
  • Before and after images of the data
  • Whether each transaction was committed or not
  • All allocation changes to database pages
  • As well as a few other things...

Why? I mean, what is the point?  OK, example time.  You are a DBA responsible for maintaining a line of business database.  Being the conscientious DBA you are, you have scheduled regular full backups - twice a day in fact!  Now, image a situation where 3 minutes and 12 seconds before a backup is scheduled to run some twit of a developer forgets to put a where clause in his update statement (or you get a torn page or something, anything bad).  What do you do?

I'm going to assume that the developers transaction has been committed.  First and foremost, you decide to restore your last full backup... from a full half day ago.  Doing well so far, but where do you get the rest of the days transactions from?  The transaction log file, of course!  After you've restored your latest backup you can roll the transaction log forward to recover the rest of your transactions meaning that even though your most up to date backup was out of date, you haven't lost anything.  True story!

how does it work?

The reason this works is very simple.  SQL Server does not commit any changes to the data file unless they have been committed to the log file first.  This ensures that no matter what happens, no data is written to disk without there being a record in the transaction log.  This is known as a Write Ahead Log (WAL).  The process behind WALing works as follows: SQL maintains a cache of data pages which are to be read or updated.  Whenever data is modified, the cache is modified NOT the physical data file.  The modification is only written to the physical file once a checkpoint is reached or SQL needs to use the cache for something else.  Every modification results in an entry in the transaction log.  The transaction log entry is written to disk BEFORE the data modification.  If it were to happen the other way around, the data modification could create a transaction which you could not roll back from.

In this way, you have a guaranteed source of data as well as an exact plan of how to get there unless you mess your backups and checkpoints up... which brings us to the next point:

"my log file is too large!"

The most common complaint regarding log files is "My log file is too large!"  Granted, this is a problem, and I'm fairly sure many of you have already experience this problem, but I'm willing to bet that this has become a problem due to the way you've decided to manage your log file.  If you've left your database setup to someone who doesn't understand how log files work you are almost guaranteed to have this problem.

As mentioned before, log files are used to store a history of all transactions that have been performed against your database.  Straight away you would think this would cause a problem.  A record of every single transaction?  How is that NOT going to get extremely large?  Valid question, and the answer is in how the log file works.  Every time you make a backup of your database you are invalidating the need for a record of all the transactions before the backup.  Granted, you can't rollback transactions anymore (not entirely true) but you have a copy of the data somewhere other than in the data file.

In order to help SQL Server maintain the transactions which can have their logs discarded and the transactions which can't, the transaction log file is split into an Active and a Inactive portion.  The active portion cannot be altered as this is the portion of the log file which is used to roll-forward transactions that were not saved at the last checkpoint.  SQL Server keeps a pointer to the start of the active portion using a MinLSN (Minimum recovery Log Sequence Number).  This is a pointer to a transaction in the log file (not necessarily the start) which tells SQL Server that everything after the pointer is an Active Transaction, everything before the MinLSN has been written to disk and can be altered.

The MinLSN gets moved when one of the following occurs:

  • A BACKUP LOG command completes
  • A CHECKPOINT command completes (only in Simple recover model) - either by using the CHECKPOINT command, or by the system issuing a checkpoint.

CHECKPOINT's are cause by the following:

  • Executing the CHECKPOINT command
  • A transaction not logged completely is executed (Bulk inserts, Select Into's...)
  • A database backup is taken
  • The database is shutdown
  • Database files are altered in some way (ALTER DATABASE commands)
  • The database issues checkpoints based on the Recover Interval (SQL estimates how many logs can be processed during the Recovery Interval)
  • In Simple Recovery mode, when the log file becomes 70% full

Don't expect your log file to shrink in size simply because you issue a CHECKPOINT command even if you are in simple recovery mode, if you have any long running transactions SQL cannot commit it and therefore cannot update the MinLSN.

 

OK, so we know that log files help ensure that we have an up-to-date copy of the data at all times and we know how it does this.  We also know that we need to be careful about choosing our recovery model and our backup procedures, but we don't have any guidelines yet as to how to do this.  To that end, next time we'll be taking a look at the various recovery models and the way they affect the growth of the transaction log.  Until then though, take a look at your log files and the general sizes they go through during the day.  See if you can get it down to a relatively consistent size regardless of the type of transactions being issued against it.

Published Wednesday, July 11, 2007 7:19 AM by CalmYourself

Comments

# re: Managing your SQL Log files

Nice Post. Will read in more detail but definitly a subject not often discussed.

Wednesday, July 11, 2007 11:02 AM by Willie Roberts

# re: Managing your SQL Log files

Very Nice! Clears things up a bit :-)

We always have checkpoints and I figure it must be because our DB mode is simple and the recovery interval is 0.

I'm eager to read the next installment!!

Wednesday, July 11, 2007 6:06 PM by qwerty

# re: Managing your SQL Log files

Nice post. Logfiles demystified ...

I don't know how far you are familiar with Sharepoint 2007 but indeed, we are experiencing growing log files. In fact, as Sharepoint itself creates the database at the web application creation, we are not really sure about the strategy to adopt with logfiles in this situation.

I also know that we are not supposed to attack directly the database and in this optic, maybe executing transaction logs can be very dangerous. Thsi remark/question is mainly the same as for other "non self-developed" products using SQL Server.

What do you think about that ?

I'm really impatient to read your opinion/next post on this subject !

Thanks !

Thursday, July 19, 2007 5:04 PM by Diabolo

# re: Managing your SQL Log files

Unfortunately I'm not familiar with Sharepoint, at all!

Are there any automatic backups that Sharepoint performs on it's database?

Can you configure these at all?

As far as not altering the database directly... My personal opinion is that you should avoid making changes that affect the way the system behaves.  That being said, if a database (custom built or otherwise) is not setup correctly, I would (and have, to my company) recommended that the settings get changed on the database as long as it presents no additional data loss risk exposure.  Just remember to test THOROUGHLY!!

Hope part 2 helps you, it's more focused on how the recovery model affects log file growth.

Friday, July 20, 2007 8:12 AM by CalmYourself

# re: Managing your SQL Log files

@Diabolo:

Sharepoint 2007 creates its databases with 'full recovery' model. However, it has very basic backup support in which it only backups the databases, NOT the transactions logs.

Your logs will only grow, unless:

1. You use your own backup strategy in which you backup both the databases and the transaction logs yourself. This will however invalidate the 'restore' option in the Sharepoint Central Administration.

2. You let Sharepoint make the backups, but you will manually truncate the logs on scheduled intervals.  This can be scripted using the following T-SQL commands:

(just an example truncating a config db logfile to 70 MB):

BACKUP LOG [Sharepoint_Config] TO DISK='drive:\path\Yourbackupfile.bak'

GO

BACKUP LOG [Sharepoint_Config] WITH TRUNCATE_ONLY

USE [SharePoint_Config]

GO

DBCC SHRINKFILE (N'SharePoint_Config_log' , 70)

GO

If you don't care about a point-in-time restore, you can skip the first backup statement.

For detailed information, see: support.microsoft.com/.../907511

Tuesday, August 07, 2007 2:52 PM by SvenP

# re: Managing your SQL Log files

I'm an absolute novice at SQL and have inherited a server with 10gb disk space and onyl 80mb free.  I can see a 3.5gb file on C:\WINNT\SYSMSI\SSEE\MSSQL.2005\MSSQL\Data\Sharepoint_Config_a84bd...._log.mdf

Having trawled Google etc it appears I can run DBCC Shrink file to reduce this log file.  Being a completed novice at SQL I don't even know where to start running the DBCC.

I'm thinking I run it from Query  Analyzer?  If so, none of the databases listed use this lof file.  Could somebody please help?!!! How do I know what is using this file (Sharepoint would seem the obvious choice) but I can't find any reference to this log file.

Many thanks for all your time.

Thursday, April 17, 2008 3:43 PM by Mark

# The Schlog » Blog Archive » Installing WSS 3.0 on SBS 2003 Standard

Pingback from  The Schlog  » Blog Archive   » Installing WSS 3.0 on SBS 2003 Standard

Leave a Comment

(required) 
(required) 
(optional)
(required) 

Enter the numbers above: