At the end of 2007 we talked about Bugchecks and why they happen. Today we're going to talk about the Crash Dump files themselves - the different types of dumps, how the dumps themselves are generated and why you will need a correctly sized page file. So, let's get started ...
By default, all Windows systems are configured to attempt to capture information about the state of the operating system in the event of a system crash. Remember that we are talking about a total system failure here, not an individual application failure. The settings for the dump files are configured using the System tool in Control Panel. Within this tool, select System Properties - on the Advanced tab there is a section for Startup and Recovery. Clicking on the Settings button brings up the dump file options as shown below. There are three different types of dump that can be captured when a system crashes:
Complete Memory Dump: This contains the entire contents of the physical memory at the time of the crash. This type of dump will require that there is a page file at least the size of physical memory plus 1MB (for the header). Because of the page file requirement, this is an uncommon setting especially for systems with large amounts of RAM. Windows NT4 only supported a Complete Memory Dump. Also, this is the default setting on Windows Server systems.
Kernel Memory Dump: A kernel dump contains only the kernel-mode read / write pages present in physical memory at the time of the crash. Since this is a kernel-mode only dump, there are no pages belonging to user-mode processes. However, it is unlikely that the user-mode process pages would be required, since a system crash (bugcheck) is usually caused by kernel-mode code. The list of running processes, the state of the current thread and the list of loaded drivers are stored in nonpaged memory, which is saved in a kernel memory dump. The size of a kernel memory dump will vary based on the amount of kernel-mode memory allocated by the Operating System and the drivers that are present on the system.
Small Memory Dump: A small memory dump (aka mini-dump) is a 64KB dump (128KB on 64-bit systems) that contains the stop code, parameters, list of loaded device drivers, information about the current process and thread, and the kernel stack for the thread that caused the crash.
Something to note here - although the need for a complete memory dump is rare when dealing with bugchecks, a complete memory dump is almost always required for manually generated crash dumps used to diagnose soft hangs on a system (for more information regarding the difference between a soft and hard hang, please see our Troubleshooting Server Hangs - Part One). This is because when looking at soft hangs we will need to look at user-mode processes, deadlocks etc. However, regardless of which type of dump you are capturing, there must be a correctly sized page file on the boot volume. For complete dumps, as stated above, this page file will need to be Physical RAM + 1MB.
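To make the complete dump requirement concrete, here is a minimal sketch of the arithmetic (Physical RAM + 1MB for the header). The function name is our own for illustration and is not part of any Windows tool.

```python
MB = 1024 * 1024

def complete_dump_page_file_bytes(physical_ram_bytes: int) -> int:
    """Page file size needed on the boot volume for a complete memory dump.

    Follows the rule stated above: physical RAM plus 1MB for the dump header.
    """
    if physical_ram_bytes <= 0:
        raise ValueError("physical_ram_bytes must be positive")
    return physical_ram_bytes + 1 * MB

# Example: a system with 4GB of RAM needs a page file of at least 4097MB.
required = complete_dump_page_file_bytes(4 * 1024 * MB)
print(required // MB, "MB")  # prints "4097 MB"
```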
So in reviewing the three types of dumps above, the kernel memory dump offers the most practical option when dealing with system crashes and bugchecks. Remember that the size of the kernel memory dumps will vary depending on the amount of kernel-mode memory allocated and the drivers loaded. On systems with more RAM, it is reasonable to expect that the dump file will be larger. There is no way to predict the exact size of a kernel memory dump. When you configure kernel memory dumps the system checks to see if the page file is large enough. There are some guidelines for the minimum page file size needed for kernel memory dumps, however given that the size of kernel mode memory will vary, there is no accurate measure for the maximum. The default minimum page file sizes for kernel dumps are shown below:
Physical RAM | Minimum Page File Size (Kernel Dump) |
< 128MB | 50MB |
< 4GB | 200MB |
< 8GB | 400MB |
>= 8GB | 800MB |
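The table above can be expressed as a small lookup. This is only a sketch of the table as printed; the names are ours, and remember that these are minimums, not a prediction of how large the kernel dump will actually be.

```python
MB = 1024 ** 2
GB = 1024 ** 3

# (exclusive upper bound on physical RAM, minimum page file size), per the table.
_KERNEL_DUMP_MINIMUMS = [
    (128 * MB, 50 * MB),
    (4 * GB, 200 * MB),
    (8 * GB, 400 * MB),
]
_LARGEST_MINIMUM = 800 * MB  # applies to 8GB of RAM and above

def kernel_dump_min_page_file(physical_ram_bytes: int) -> int:
    """Return the default minimum page file size for a kernel memory dump."""
    for upper_bound, minimum in _KERNEL_DUMP_MINIMUMS:
        if physical_ram_bytes < upper_bound:
            return minimum
    return _LARGEST_MINIMUM

print(kernel_dump_min_page_file(2 * GB) // MB, "MB")  # prints "200 MB"
```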
In addition to correctly sizing the page file, you also need to ensure that you have sufficient free disk space for the actual dump file itself to be written. Unlike the page file used to capture the dump, the dump file itself can be written to a different local volume by changing the location in the Dump File field. If there is a need to maintain multiple dumps of an issue, then you should uncheck the "Overwrite any existing file" box as well. However, please remember that this may put a strain on free disk space over time.
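If you want to check the free space on the dump destination ahead of time, something along these lines may help. It is a rough sketch using Python's standard shutil.disk_usage call; the function name and the expected dump size you pass in are your own estimates, not values Windows reports.

```python
import shutil

def has_room_for_dump(dump_dir: str, expected_dump_bytes: int) -> bool:
    """Return True if the volume holding dump_dir has enough free space.

    expected_dump_bytes is an estimate supplied by the caller, for example
    physical RAM + 1MB when a complete memory dump is configured.
    """
    free_bytes = shutil.disk_usage(dump_dir).free
    return free_bytes >= expected_dump_bytes

if __name__ == "__main__":
    # Hypothetical check: is there room for a 4GB dump in the current directory?
    print(has_room_for_dump(".", 4 * 1024 ** 3))
```

If you keep multiple dumps (the "Overwrite any existing file" box unchecked), multiply the estimate by the number of dumps you expect to retain.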
Let's take a quick moment and talk about how the dump files themselves are generated. When a system boots up, it checks the crash dump options in the HKLM\System\CurrentControlSet\Control\CrashControl registry key. All of the settings available in the GUI can be modified via the registry as shown below:
- Write an event to the System Log checkbox = LogEvent
- Automatically Restart checkbox = AutoReboot
- Write Debugging Information drop-down = CrashDumpEnabled
- Dump File text box = DumpFile
- Overwrite any existing file checkbox = Overwrite
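As a sketch, the mapping above can be read back with Python's standard winreg module (Windows only). The value names come from the list above; the meanings of CrashDumpEnabled values 0, 2 and 3 are not stated in this post and are taken from Microsoft's general documentation, so treat them as an assumption. The function and dictionary names are ours.

```python
CRASH_CONTROL_KEY = r"System\CurrentControlSet\Control\CrashControl"

# GUI setting -> registry value name, as listed above.
GUI_TO_REGISTRY = {
    "Write an event to the System Log": "LogEvent",
    "Automatically Restart": "AutoReboot",
    "Write Debugging Information": "CrashDumpEnabled",
    "Dump File": "DumpFile",
    "Overwrite any existing file": "Overwrite",
}

# Assumption: only the value 1 (complete dump) is given in this post.
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump",
}

def describe_dump_type(crash_dump_enabled: int) -> str:
    """Translate a CrashDumpEnabled value into a readable dump type."""
    return DUMP_TYPES.get(crash_dump_enabled, f"Unknown ({crash_dump_enabled})")

def read_crash_control() -> dict:
    """Read the crash dump settings from the registry (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        for name in GUI_TO_REGISTRY.values():
            try:
                value, _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                value = None  # value is not set on this system
            settings[name] = value
    return settings

if __name__ == "__main__":
    current = read_crash_control()
    print(current)
    if current.get("CrashDumpEnabled") is not None:
        print(describe_dump_type(current["CrashDumpEnabled"]))
```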
A quick tangent here - if you have a system with more than 2GB of RAM, the option for a complete memory dump is not available in the GUI drop-down. This behavior is described in Microsoft KB Article 274598. It is possible to enable a complete memory dump by setting the CrashDumpEnabled value in the HKLM\System\CurrentControlSet\Control\CrashControl registry key to 1. Note that the GUI will still not show the option for a complete memory dump. If you need a complete memory dump for troubleshooting specific issues, then you may want to consider using the MAXMEM switch in the boot.ini file on 32-bit systems to limit the amount of RAM in use by the Operating System to 2GB or less (see Microsoft KB Article 108393 for details). The GUI will then display the option for a complete memory dump. In addition, the dump file will be created more quickly, which reduces the amount of downtime. This is ideal for troubleshooting scenarios - not for long-term usage - as you are limiting the RAM available to the system.
Returning to the subject of how the dump file itself is generated: if a dump is configured, the system makes a copy in memory of the disk miniport driver used to write to the boot volume and prepends "dump_" to the driver name. The system also checksums all of the components involved in writing a crash dump: the copied disk miniport driver, the I/O manager functions that write the dump, and the map of where the boot volume's page file is on the disk. This checksum is saved.
When the KeBugCheck function executes, it checksums these components again and compares the result to the checksum created at boot. If the checksums do not match, no dump file is written (because of the risk of corrupting the disk). If the checksums match, the dump information is written directly to the sectors on disk occupied by the page file. The file system driver is completely bypassed - because it may be corrupted or be the cause of the crash.
When SMSS.EXE enables paging during the next boot, the system examines the boot volume's page file to see if there is a crash dump present. If one exists, then this part of the page file is protected. This makes all (or part) of the boot volume's page file unusable during the early part of the boot process, which may result in notifications that the system is low on virtual memory - a temporary condition. Later in the boot process, WINLOGON.EXE calls the SAVEDUMP.EXE process to extract the dump from the page file and copy it to the final location that is specified in the Dump File field.
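The checksum comparison described above can be pictured with a toy model. To be clear, this is a conceptual sketch only: the post does not say which checksum algorithm Windows uses, so zlib.crc32 is a stand-in, and the component names and byte strings are made up for illustration.

```python
import zlib

def checksum_components(components: dict) -> int:
    """Fold a CRC32 over every component, visiting them in a fixed order."""
    crc = 0
    for name in sorted(components):
        crc = zlib.crc32(name.encode("utf-8"), crc)
        crc = zlib.crc32(components[name], crc)
    return crc

def can_write_dump(boot_checksum: int, components_at_crash: dict) -> bool:
    """Mirror the rule above: write the dump only if the checksums match."""
    return checksum_components(components_at_crash) == boot_checksum

# Hypothetical stand-ins for the three things the system checksums at boot.
components = {
    "dump_miniport_copy": b"copied disk miniport driver image",
    "io_dump_routines": b"I/O manager dump functions",
    "page_file_map": b"sector map of the boot volume page file",
}
boot_checksum = checksum_components(components)

# Nothing changed since boot: the dump may be written.
print(can_write_dump(boot_checksum, components))  # prints "True"

# The page file map was overwritten before the crash: no dump is written.
corrupted = dict(components, page_file_map=b"overwritten by a bad driver")
print(can_write_dump(boot_checksum, corrupted))  # prints "False"
```

The point of the design is that the crash-time code trusts nothing it has not verified, since a corrupted driver or page file map could otherwise write dump data over unrelated sectors.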
On Windows Server 2003, there is some slightly different behavior that is outlined in KB Article 886429. Following the server reboot after the bugcheck, Windows requires a temporary file on the boot volume equal to the size of physical RAM. If there is insufficient disk space to meet this requirement, the dump file is still generated, however the page file size on this volume is reduced. In the first stage of the dump operation, the Session Manager Subsystem (SMSS.EXE) examines the page file head block to determine whether the file is a valid memory dump. If the file is valid, then SMSS.EXE truncates the page file to the size of the dump file and renames the file to Dumpxxx.tmp (the xxx value is calculated from the Lower Word of the tickcount function). SMSS stores the Dumpxxx.tmp file on the boot volume and sets a TempDestination value and a DumpFile value in a volatile registry subkey (HKLM\System\CurrentControlSet\Control\CrashControl\MachineCrash). SAVEDUMP.EXE reads this registry location to determine if a valid memory dump exists and copies the Dumpxxx.tmp file to Memory.dmp.