Your browser (Internet Explorer 7 or lower) is out of date. It has known security flaws and may not display all features of this and other websites. Learn how to update your browser.

X

Master database error after applying SQL Server 2008 R2 SP2

When applying SP to SQL Server you need to be careful, I mean really, with planning and attention to every step and especially to errors if any appear.

While I know perfectly the meaning of this sentence I was not as careful as needed applying the SQL Server 2008 R2 SP2, because it was on my laptop, and I did couple of other things meanwhile including listening to my baby girl first songs 🙂

So I run the SP2 installation, went trough the Wizard and started the process. I even forgot to close the SSMS and the installation process kindly reminded me that it didn’t have control over SSMS. OK, so I closed it and started the patch again. After several minutes I saw the lovely green icon and the message of successful installation. Great! One more task from the checklist is done, while of course thinking about completely different thinks and watching my baby’s first steps! BUT when I started the SSMS and tried to login the “nice” messages came in.

First, I couldn’t connect to my SQL Server (network protocol message error message + something like it is in the process of updating and only the administrator could connect (!) That’s me! I am the administrator!..). Ok, let’s check Config Mngr – the service is stopped! Strange, it shouldn’t be. I tried to start the SQL Server Service, and it started successfully. Tried to logon again, but couldn’t! The same message. Well, probably something with the registry, I though, pending restart.. wait! It shouldn’t require a restart. Ok, not thinking a lot, I restarted the machine, it’s my laptop, no problem, probably something in the registry was messed up. Trying to logon again and the same network protocol or no service message. (Restarting the SQL Server or the machine rarely helps, so don’t do it as a try-something” step, I still rarely can resist to the common “restart rule”)

OK dear SQL Server, you grabbed my attention!

I checked the log (always start from this step) and ..WOW!

Error: Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Wow!! Of course I don’t have a backup of the master database, it is just a test and dev SQL Server Installation on my laptop. Hm, Let’s go trough the log:

2012-11-08 19:51:42.94 spid7s      Error: 17066, Severity: 16, State: 1.
2012-11-08 19:51:42.94 spid7s      SQL Server Assertion: File: <AllocPageRef.cpp>, line=585 Failed Assertion = ‘IsSGAMPage()’. This error may be timing-related. If the error persists after rerunning the statement, use DBCC CHECKDB to check the database for structural integrity, or restart the server to ensure in-memory data structures are not corrupted.

2012-11-08 19:51:42.94 spid7s      Error: 3624, Severity: 20, State: 1.
2012-11-08 19:51:42.94 spid7s      A system assertion check has failed. Check the SQL Server error log for details. Typically, an assertion failure is caused by a software bug or data corruption. To check for database corruption, consider running DBCC CHECKDB. If you agreed to send dumps to Microsoft during setup, a mini dump will be sent to Microsoft. An update might be available from Microsoft in the latest Service Pack or in a QFE from Technical Support.

Looks like an allocation error and corruption. Next please:

2012-11-08 19:51:42.94 spid7s      Error: 912, Severity: 21, State: 2.
2012-11-08 19:51:42.94 spid7s      Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 3699, state 2, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.

Nice! The final script of the patching process couldn’t finish, probably because of master database corruption?! Or the SP corrupted my master database!? Or found something about my master db that I didn’t know about?! And what’s next:

2012-11-08 19:51:42.94 spid7s      Error: 3417, Severity: 21, State: 3.
2012-11-08 19:51:42.94 spid7s      Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.
2012-11-08 19:51:43.72 spid7s      SQL Trace was stopped due to server shutdown. Trace ID = ‘1’. This is an informational message only; no user action is required.

Let’s go back to the log and read carefully when and how and when exactly this has happened. Just before taking the dump the process is trying to open the database testcorruption and it couldn’t . And because it couldn’t open it it couldn’t upgrade the metadata. And it couldn’t complete the whole process

Is this the main problem?! My database testcorruption is one of several databases that I use to test and demonstrate the corruption survival techniques in my classes. And of course it is corrupted, I made an allocation page corruption myself. Anyway, the SP shouldn’t leave the things in this way. The problem is in my user database, but it shouldn’t reflect the overall Instance update. If this is the case then there is definitely a bug (or unwanted behavior) with this SP.

Actually because you cannot connect to SQL Server you cannot run DBCC CHECKDB. Beside that when there is such kind of a problem (database cannot be opened by the SQL Server process), then the database most probably cannot be recovered by DBCC CHECKDB. I manage to start the SQL Server, which means that master database is ok, so I can use DAC in order to try to fix the testcorruption. Actually I just didn’t waste the time to do that, because I know the error is unrecoverable. So I just dropped the db files from the folder, restarted the SQL Server service in order to force the SP to complete and it worked. After that I was able to connect and my SQL Server was working fine. Strange, isn’t it?!

Note! : I am not saying that this is your case and you should drop database files in order to fix it!

I tried to reproduce the problem in VM applying and removing Service packs, trying different scenarios of database corruption with no success. which is even more strange, because it could mean that there are some additional factors leading to this SP error, not just facing recovery pending of user db.

What you could do if you encounter the problem:

  1. Read the log very carefully! Do not try any steps before you familiarize yourself with the error log!
  2. The case is registered @ connect with no workaround nor solution provided. But till this moment I found that there are two common reasons (other than my case) for this error – “wrong folders” and “implicit transaction mode”. Check them first:
    1. Check if it is “wrong folders” problem: http://blogs.msdn.com/b/sqlserverfaq/archive/2010/10/27/sql-server-2008-service-fails-to-start-after-applying-service-pack-1.aspx
    2. Check if it is “Implicit transaction mode” problem: http://social.msdn.microsoft.com/Forums/en-SG/sqlsetupandupgrade/thread/2a1ba884-a34f-401b-9510-2d278f0cb607

We have an option to uninstall the SP from the Control Panel | Uninstall Program (from SQL Server 2008) but I am afraid that it is an option when the SP is installed successfully with no problems at all.

I cannot finish this post without giving recommendations. Here are some advices and steps to perform BEFORE applying any patches:

  1. Ensure that your databases are up and running and in a health status (check status in sys.databases first, run DBCC CHECKDB, or if it is part of your maintenance plan check the last messages)
  2. Have backups of user and system databases and check them before applying SP!
  3. Check if there is a Connect item for this SP or any problem in advance and perform respective steps in order to reduce the risk

Good luck and thanks for reading! 🙂