Tuesday, March 17, 2015

Delayed Write Failed

"Delayed Write Failed" error when an I/O stress test runs against a Windows Server 2012 failover cluster from a Windows 8-based or Windows Server 2012-based client


Article ID: 2842111

Symptoms

Consider the following scenario:
  • You have a Windows Server 2012 failover cluster that is configured by using continuously available file shares.
  • An I/O stress test is running on a Windows 8-based or Windows Server 2012-based client against the failover cluster. The stress test has a high ratio of open and close operations to data operations. For example, the test repeatedly opens a file on the file share, reads the file, and then closes the file.

    Note This scenario may be found in stress tests but does not map directly to customer-usage scenarios.
In this scenario, you may experience I/O errors during failover. Additionally, the following event may be logged in the System log:

Event ID: 50
Event Source: Mup
Description: {Delayed Write Failed} Windows was unable to save all the data from the file <file name>. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.
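
The open, read, and close pattern described in the scenario can be sketched in Python as follows. This is only an illustration of the access pattern, not the actual stress tool: the function names are hypothetical, and a local temporary file stands in for a file on the continuously available share.

```python
import os
import tempfile


def open_close_stress(path, iterations):
    """Repeatedly open, read, and close one file; return total bytes read."""
    total = 0
    for _ in range(iterations):
        with open(path, "rb") as handle:  # open
            total += len(handle.read())   # read
        # leaving the with-block closes the handle
    return total


def demo():
    # Assumption: a local temporary file replaces the file on the share.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 64)
        path = tmp.name
    try:
        return open_close_stress(path, 100)
    finally:
        os.remove(path)
```

Each iteration moves very little data but creates and releases one handle, which is the high open/close-to-data ratio that the scenario describes.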

Cause

When a file on the file share is opened, a file handle is created. After the file is closed, the Server Message Block (SMB) redirector will cache the file handle for a short time. However, there is a limit on the number of handles that can be cached in this manner. During the stress test, the SMB scavenger can fall behind in closing the cached handles. This may result in a large backlog of handles. Eventually, the number of handles exceeds the limit that can be failed over within the continuous availability time-out and some I/O operations may fail. By default, the continuous availability time-out is 60 seconds.
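
The backlog effect described above can be modeled with a rough Python sketch. Only the 60-second time-out comes from this article. The per-second rates and both function names are hypothetical values chosen for illustration, and the model does not reflect how the SMB redirector is actually implemented.

```python
CA_TIMEOUT_SECONDS = 60  # default continuous availability time-out (from the article)


def backlog_after(seconds, closes_per_sec, scavenged_per_sec):
    """Cached handles left over after `seconds` of steady load."""
    backlog = 0
    for _ in range(seconds):
        backlog += closes_per_sec                   # handles newly cached on close
        backlog -= min(backlog, scavenged_per_sec)  # handles the scavenger releases
    return backlog


def fits_in_timeout(backlog, resumed_per_sec, timeout=CA_TIMEOUT_SECONDS):
    """True if every cached handle could be failed over within the time-out."""
    return backlog <= resumed_per_sec * timeout
```

For example, with hypothetical rates of 500 closes and 400 scavenged handles per second, `backlog_after(100, 500, 400)` grows to 10,000 handles. If only 150 handles per second could be failed over, `fits_in_timeout(10000, 150)` is false, because 150 × 60 = 9,000.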

Resolution

This hotfix is also available at Microsoft Update Catalog.

Hotfix information

A supported hotfix is available from Microsoft. However, this hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that are experiencing the problem described in this article. This hotfix might receive additional testing. Therefore, if you are not severely affected by this problem, we recommend that you wait for the next software update that contains this hotfix.

If the hotfix is available for download, there is a "Hotfix download available" section at the top of this Knowledge Base article. If this section does not appear, contact Microsoft Customer Service and Support to obtain the hotfix.

Note If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs will apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers or to create a separate service request, visit the following Microsoft Web site:
Note The "Hotfix download available" form displays the languages for which the hotfix is available. If you do not see your language, it is because a hotfix is not available for that language.

Prerequisites

To apply this hotfix, you must be running Windows 8 or Windows Server 2012.

Restart requirement

You must restart the computer after you apply this hotfix.

Hotfix replacement information

This hotfix does not replace a previously released hotfix.

File information

Status

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

More information

You can use the following Windows PowerShell command to obtain the number of handles on the SMB client:
Get-SmbConnection
When the issue occurs, the number of handles is much higher than expected. For example, you may see more than 10,000 handles although the expected number is about 100 handles.

Hardware configuration and performance may affect the threshold at which the issue occurs.
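
As a rough aid for spotting the condition, the following Python sketch flags connections whose handle count is far above the baseline. It assumes that the command output was saved as JSON, for example with `Get-SmbConnection | Select-Object ServerName, ShareName, NumOpens | ConvertTo-Json`, and that the `NumOpens` property is the handle count that this article refers to. The function name and the factor of 10 are hypothetical. Because the threshold depends on hardware and performance, adjust both values for your environment.

```python
import json

EXPECTED_HANDLES = 100  # "about 100" per the article; a rough baseline only


def suspicious_connections(connections_json, factor=10):
    """Return (server, share, opens) for entries far above the baseline."""
    records = json.loads(connections_json)
    if isinstance(records, dict):  # a single record is emitted as a bare object
        records = [records]
    limit = EXPECTED_HANDLES * factor  # hypothetical cut-off, not a documented limit
    return [
        (r["ServerName"], r["ShareName"], r["NumOpens"])
        for r in records
        if r["NumOpens"] > limit
    ]
```

An empty list means that no connection exceeded the chosen cut-off. It does not prove that the system is unaffected.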

For more information about software update terminology, click the following article number to view the article in the Microsoft Knowledge Base:
824684 Description of the standard terminology that is used to describe Microsoft software updates
