Sunday, March 22, 2015

TCP Offloading/Chimney & RSS…What is it and should I disable it?


Having decided to start this blog to share my experience with network analysis and troubleshooting, one subject instantly sprang to mind for my first post: TCP Chimney/Offloading.
I get asked about this so often I have a ready email of advice around what it is, and what I (note not Microsoft, although I think our official recommendation would likely mirror mine) recommend to customers around the use of it. My advice is based on thousands of customer cases I've handled over the years around this feature. I've therefore compiled what's hopefully a one-stop shop for all your TCP offloading needs. Apologies in advance if it's a bit wordy, but I've tried to convey everything I can around the subject for you.
So why do I get asked about it all the time? Well, let's start with what it is and what it does.

What is TCP Offloading/Chimney?
Starting when Windows Server 2003 SP1 was the current server OS, Microsoft released the Scalable Networking Pack http://support.microsoft.com/kb/912222/en-us
This turned on three distinct features in the OS:

1.) RSS
RSS stands for Receive Side Scaling. Without it, on a computer with multiple CPUs, the Windows networking stack limits "receive" protocol processing to a single CPU. That old design of dealing with all incoming network traffic on a single processor core was starting to cause a bottleneck on newer multiprocessor systems. RSS resolves this by allowing the packets received from a network adapter to be balanced across multiple CPUs. With RSS on, each incoming TCP connection is load balanced over the available cores, spreading the load and preventing the bottleneck. This is becoming a necessity as servers have to handle increasingly high loads of network traffic.
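
To make the idea concrete, here's a minimal sketch of spreading connections over cores. It's an illustration only, not how Windows or any NIC does it: real RSS hashing is done by the NIC (a Toeplitz hash plus an indirection table), whereas this sketch swaps in CRC32, and the function name, addresses and CPU count are made up for the example.

```python
import zlib


def pick_cpu(src_ip: str, src_port: int, dst_ip: str, dst_port: int, cpu_count: int) -> int:
    """Map a TCP connection's 4-tuple to a CPU index (illustrative only)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode("ascii")
    # CRC32 stands in for the NIC's real hash. The same connection always
    # hashes to the same value, so all of its packets land on the same CPU.
    return zlib.crc32(key) % cpu_count


if __name__ == "__main__":
    # Hypothetical client ports connecting to one server on a 4-core box.
    for port in range(52613, 52617):
        print(port, "-> CPU", pick_cpu("192.168.1.50", port, "192.168.1.103", 445, 4))
```

The point to take away is that the balancing is per connection, not per packet: one connection stays on one core, and many connections spread out.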

2.) TCP Chimney (sometimes referred to as TCP Offloading)
This feature is designed to move network processing tasks, such as packet segmentation and reassembly, from the computer's CPU to a network adapter that supports TCP Chimney Offload. This reduces the workload on the host CPU by moving it to the NIC, which allows the host OS to perform quicker and also speeds up the processing of network traffic.

3.) Network Direct Memory Access (NetDMA)
http://technet.microsoft.com/en-us/library/gg162716(WS.10).aspx
The NetDMA interface provides generic access to direct memory access (DMA) engines that can perform memory-to-memory data transfers with little CPU involvement. Again, this is designed to take work away from the CPU by allowing the NIC to move data from receive buffers without using the CPU as much.

Why would I want to disable it then?
All these features sound brilliant, and they were only enabled by installing the Scalable Networking Pack, so why would you want to disable them?
Well, with the release of Service Pack 2 for Windows Server 2003, Microsoft decided to include the Scalable Networking Pack and thus turn the features on. If a server has a NIC which supports these features, and they're enabled in the NIC properties (more on this later), then we'll use them. The problem with this approach was that many NICs reported to the OS that they supported these features, which they did, but in reality many didn't perform these functions very well at all. With 2003 this was an all-or-nothing affair: if offloading was turned on, we offloaded all TCP connections on a supported NIC, regardless of whether it would benefit or not.
Over the next year or so, Microsoft support saw a great number of issues caused by drivers misbehaving with these features, which had a knock-on effect on network traffic. This in turn caused a wide range of weird and wonderful symptoms across the board, from Exchange to SQL to BizTalk to IIS to ISA. Most of these landed in the lap of my colleagues and me in the Networking support team, and we've probably seen issues numbering in the thousands caused by these features where turning them off resolved the problem. So much so that turning off Offloading/RSS became an almost standard troubleshooting step with 2003 cases.
Most NIC vendors have released numerous updates over the years to resolve these issues, as has Microsoft to improve the feature in the OS. However, my general advice with Server 2003 is to disable these features.
In essence, with Server 2003 this feature tends to cause more problems than it solves, so I tend to recommend it is turned off. It doesn't really become effective until you get high-speed (10 Gbps), low-latency networks and, as mentioned, many older NIC drivers don't implement it well, so we get a lot of issues. If it is enabled, the bare minimum recommendation would be that the NIC and teaming drivers and firmware are the latest available and that http://support.microsoft.com/kb/950224/en-us is applied.

How do I turn off these features in Server 2003?
If the features aren't needed then they should be turned off. We can do this in one of two ways.
The safest way is to do this in the registry by using method 3 in http://support.microsoft.com/kb/948496/en-us, which sets the following values to 0 under this registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

  • Right-click EnableTCPChimney, and then click Modify.
  • In the Value data box, type 0, and then click OK.
  • Right-click EnableRSS, and then click Modify.
  • In the Value data box, type 0, and then click OK.
  • Right-click EnableTCPA, and then click Modify.
  • In the Value data box, type 0, and then click OK.
  • Exit Registry Editor, and then restart the computer.
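
If you need to apply the same three values across many servers, one option is to generate a .reg file and import it rather than clicking through Registry Editor each time. The sketch below only builds the file text from the key and value names listed above. The function name is my own, and I'm assuming the three values are DWORDs. A restart is still needed after importing.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUES = ("EnableTCPChimney", "EnableRSS", "EnableTCPA")


def build_reg_file(key: str = KEY, names=VALUES, data: int = 0) -> str:
    """Return .reg file text that sets each named value to `data`.

    Assumption: the values are DWORDs, so they are written as dword:XXXXXXXX.
    """
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name in names:
        # .reg files write DWORD data as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{data:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Print the text; save it to a .reg file and import it on the server.
    print(build_reg_file(), end="")
```

Test it on one non-production server first, and keep an export of the original key so you can put the values back.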

Alternatively you can disable the features in the NIC properties of most NICs, however the naming convention and exposure of these settings varies from NIC to NIC and also from driver to driver.


If the feature is disabled in either the NIC properties or the registry, it's off, regardless of whether the other is on. This is why I recommend you use the registry, as it won't be affected by driver updates etc. and is much easier to control centrally.

So what about newer OS versions?
The first point of note is that Microsoft made a lot of effort in the newer OS versions to ensure the drivers were up to the job, and also made a lot of improvements to the implementation of the features and to the TCP stack in the OS. This makes enabling the features much safer in post-2003 OS versions.

Server 2008
In Windows Server 2008 (not R2), Offloading is turned off by default anyway. http://support.microsoft.com/kb/951037/en-us
You can enable it using the NIC properties or by using Netsh, which is outlined in the link above. The offloading capabilities are more granular in 2008 than they were in 2003: we offload suitable network connections.
As before, if you're using this feature it's always wise to install the latest NIC drivers and firmware so that the NIC manufacturer's latest updates are in place.
Ensure you have http://support.microsoft.com/kb/976035/en-us installed on top of SP2 to prevent an unexpected restart scenario.
If the above steps are done, in my experience it's very safe to turn the feature on in 2008 if you feel it is needed.
To check the state of offloading, run netstat -t in a command prompt and you'll get output like the following:

Active Connections

  Proto  Local Address          Foreign Address        State        Offload State
  TCP    127.0.0.1:52613        computer_name:52614    ESTABLISHED  InHost
  TCP    192.168.1.103:52614    computer_name:52613    ESTABLISHED  Offloaded

InHost shows the connection is not offloaded and is thus handled by the OS. Offloaded means exactly that.
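
On a busy server that list gets long, so it can help to count the states rather than read them. Here's a small sketch that does so from captured netstat -t text. It assumes the output has the same five-column layout as the sample above (no spaces inside any column), and the function name is mine.

```python
from collections import Counter


def count_offload_states(netstat_output: str) -> Counter:
    """Count TCP connections by the last column of `netstat -t` output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: TCP <local> <foreign> <state> <offload state>
        if len(fields) == 5 and fields[0] == "TCP":
            counts[fields[4]] += 1
    return counts


SAMPLE = """\
Active Connections

Proto Local Address Foreign Address State Offload State
TCP 127.0.0.1:52613 computer_name:52614 ESTABLISHED InHost
TCP 192.168.1.103:52614 computer_name:52613 ESTABLISHED Offloaded
"""

if __name__ == "__main__":
    print(count_offload_states(SAMPLE))
```

If every connection shows InHost, nothing is being offloaded, whatever the NIC properties say.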
For those of you looking in memory dumps and wondering if these features are in use, on Server 2003 you should be able to dump the values of the registry keys used to set this by running x tcpip!*disable* in WinDbg.
In this example, both RSS and TCP Chimney are disabled.
x tcpip!*disable*
b8f1a0d4 tcpip!DisableRSS = 1
b8f1a360 tcpip!DisableUserTOSSetting = 1
b8f1df34 tcpip!DisableMediaSenseEventLog = 0
b8f1a0d0 tcpip!DisableTCPChimney = 1
b8f1ae54 tcpip!DisableTaskOffload = 0
b8f1cdc0 tcpip!DisableLargeSendOffload = 0
b8f1a0b4 tcpip!DisableIPSourceRouting = 2
b8f1ae4c tcpip!DisableMediaSense = 0
b8f1a0ec tcpip!DisableUserTOS = 1
b8f01ff3 tcpip!DisableRouter (void)
b8f0c3b0 tcpip!IPDisableMediaSenseRequest (struct _IRP *, struct _IO_STACK_LOCATION *)
b8f106d6 tcpip!OlmDisableOffloadOnInterface (unsigned int)
b8f04d4b tcpip!IPDisableChimneyOffload (struct _IRP *, struct _IO_STACK_LOCATION *)
b8f048bf tcpip!IPDisableSniffer (struct _UNICODE_STRING *)
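
If you're going through a lot of dumps, you can pull the two values you care about out of that output with a few lines of script. This is only a sketch: it assumes the variable lines look exactly like the ones above (address, tcpip!Name = small number), and it skips the function symbols, which have no "= value" part. The function name is mine.

```python
import re

# Matches lines such as "b8f1a0d4 tcpip!DisableRSS = 1".
# Assumption: values are small numbers printed as in the sample output.
_GLOBAL = re.compile(r"^[0-9a-fA-F`]+\s+tcpip!(\w+)\s+=\s+(\d+)$")


def parse_tcpip_globals(windbg_output: str) -> dict:
    """Return {variable name: value} for each 'tcpip!Name = N' line."""
    result = {}
    for line in windbg_output.splitlines():
        match = _GLOBAL.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


SAMPLE = """\
b8f1a0d4 tcpip!DisableRSS = 1
b8f1a0d0 tcpip!DisableTCPChimney = 1
b8f1ae54 tcpip!DisableTaskOffload = 0
b8f01ff3 tcpip!DisableRouter (void)
"""

if __name__ == "__main__":
    values = parse_tcpip_globals(SAMPLE)
    print("RSS disabled:", values.get("DisableRSS") == 1)
    print("Chimney disabled:", values.get("DisableTCPChimney") == 1)
```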

As these structures don't exist in 2008 and later, you'll need to use another command. I'm currently trying to confirm what the best method is and I'll update the blog with the info.

Server 2008 R2
With Server 2008 R2 this feature is much more intelligent: it'll only offload when the conditions are right, as per http://technet.microsoft.com/en-us/library/gg162709(WS.10).aspx
Automatic. In automatic mode, TCP Chimney Offload considers offloading the processing for a connection only if the following criteria are met: the connection is established through a 10 Gbps Ethernet adapter, the mean round trip link latency is less than 20 milliseconds, and at least 130 KB of data has been exchanged over the connection. In automatic mode, the TCP receive window is set to 16 MB. Because the Windows stack has performance optimizations not found in Chimney-capable network adapters, automatic mode restricts offloads only to those connections that might receive the most benefit from it.
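
Written out as a check, the three criteria quoted above look something like this. It's just my reading of that paragraph, not the actual Windows logic: the function and parameter names are mine, I've taken "a 10 Gbps adapter" to mean 10 Gbps or faster, and I've taken 1 KB to be 1024 bytes.

```python
def eligible_for_offload(link_gbps: float, mean_rtt_ms: float, bytes_exchanged: int) -> bool:
    """Apply the three automatic-mode criteria from the quoted text."""
    return (
        link_gbps >= 10                    # assumption: 10 Gbps or faster
        and mean_rtt_ms < 20               # mean round trip latency under 20 ms
        and bytes_exchanged >= 130 * 1024  # at least 130 KB, with 1 KB = 1024 bytes
    )


if __name__ == "__main__":
    # A long-lived transfer on a fast, local link qualifies.
    print(eligible_for_offload(10, 2.5, 5_000_000))
    # The same transfer over a 1 Gbps NIC does not.
    print(eligible_for_offload(1, 2.5, 5_000_000))
```

This is why short-lived or WAN connections stay InHost in automatic mode even when the NIC supports offloading.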
This is the default setting and I'd advise leaving it as default. As always, ensure the latest NIC drivers and firmware are installed to remove the risk of any known issues, but in my experience, taken from many thousands of customers, this feature is a real benefit to the OS. In fact, I've seen multiple customers who have gotten into the habit of disabling these features in their OS build following the issues they experienced with Server 2003. I've been called out to look at performance issues, and when we've re-enabled the features we've noticed a massive performance improvement.
If you are getting problems which are resolved by turning off TOE (the TCP offload engine) in 2008 R2, my first step would be to update the NIC driver and firmware, as there are almost always updates for the NICs which resolve the majority of offloading issues I encounter.
If the problem persists, turning off Offloading is the wrong thing to do. Raise a case with Microsoft and we'll help you get to the bottom of it. By having a policy of disabling these features, you are effectively restricting your Windows platform's network performance for the sake of one or two issues which could be investigated and resolved.
The performance improvement on certain connections is enormous and shouldn't be thrown away due to habit (i.e. the 2003 behaviour) or a few issues which haven't been fully investigated. In my personal experience, quick fixes will eventually come round and bite you.
To manage the settings in 2008 R2, the following article gives more information on the Netsh commands available.
http://technet.microsoft.com/en-us/library/gg162682(v=ws.10).aspx
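
For anyone scripting this, here's a sketch that builds the Netsh command lines rather than typing them by hand. It only constructs the argument lists and doesn't run anything. The helper name and the table of allowed values are mine, based on the netsh int tcp set global options as I know them, so check the article above for what your OS version accepts.

```python
# Values I believe `netsh int tcp set global` accepts for each setting;
# treat this table as an assumption and verify against the article.
ALLOWED = {
    "chimney": {"enabled", "disabled", "automatic", "default"},
    "rss": {"enabled", "disabled", "default"},
}

# Shows the current global TCP settings, including Chimney and RSS state.
SHOW_GLOBAL = ["netsh", "int", "tcp", "show", "global"]


def netsh_set_global(setting: str, value: str) -> list:
    """Build the argument list for `netsh int tcp set global <setting>=<value>`."""
    if value not in ALLOWED.get(setting, ()):
        raise ValueError(f"unsupported combination: {setting}={value}")
    return ["netsh", "int", "tcp", "set", "global", f"{setting}={value}"]


if __name__ == "__main__":
    print(" ".join(SHOW_GLOBAL))
    print(" ".join(netsh_set_global("chimney", "automatic")))
    print(" ".join(netsh_set_global("rss", "enabled")))
```

The lists can be handed to subprocess.run from an elevated prompt, but I'd run the show command first and record the output before changing anything.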
It's also advised to install http://support.microsoft.com/kb/2477730/en-us to resolve an issue with offloading in 2008 R2. This is non-urgent, so it could be planned into your next change window.

Server 2012
Offloading in Windows Server 2012 works much as it does in Server 2008 R2, so the same advice applies. RSS, however, becomes more important in this OS because SMB Multichannel relies on it.
http://support.microsoft.com/kb/2846837/en-us is a recommended hotfix for RSS in Server 2012.

Additional points of note around offloading:
Large send offload and checksum offload
I've seen many references on the internet suggesting that TCP task offloading features, such as Checksum offload and Large Send offload, are related to TCP Chimney. It's important to note that these are not related to the TCP Chimney/offload described above. Checksum offload is where we allow the NIC to set the checksum on a packet when it leaves the machine (which is why Netmon and Wireshark often show "incorrect checksum" on packets, as the driver which captures them sits above the NIC where the checksum is set). Large Send offload (LSO) allows the Application layer to hand down a packet which would be too big for transmission and lets the NIC chop it up into transmittable sizes (which is why you can see packets with more than 1460 bytes of payload in Netmon/Wireshark).
These can be set in the NIC properties but are generally very safe to leave on. You may want to disable LSO if you're sniffing traffic, as you won't be seeing the packets as they are transmitted on the wire.
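
To picture what the NIC is doing with LSO, here's a sketch of just the chopping step, using the 1460-byte payload size mentioned above. A real NIC also builds the TCP/IP headers, sequence numbers and checksums for every segment, none of which is shown here, and the function name is mine.

```python
def segment(payload: bytes, mss: int = 1460) -> list:
    """Split one large send into payload chunks of at most `mss` bytes."""
    if mss < 1:
        raise ValueError("mss must be at least 1")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


if __name__ == "__main__":
    big_send = b"x" * 4000  # what the capture driver sees as one large packet
    for number, chunk in enumerate(segment(big_send), start=1):
        print("segment", number, "carries", len(chunk), "bytes")
```

With LSO on, your trace shows the single large send handed to the NIC, while the wire carries the individual segments.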



Network tracing offloaded connections:
Another reason you may want to disable TCP offloading is if you want to take a network trace. Both Netmon's filter driver and Wireshark's will show you only the three-way handshake and the session teardown if offloading is being used. This is due to where the drivers sit: when offloading is used, the data bypasses these drivers, so you'll only see the parts of the session the OS is responsible for, the session setup and teardown.
Message Analyzer, introduced at http://blogs.technet.com/b/messageanalyzer/archive/2012/09/17/meet-the-successor-to-microsoft-network-monitor.aspx, is a new tool from Microsoft which allows us to trace at layers other than NDIS (where Netmon sits) and thus may allow you to work round this issue, depending on the scenario.

So, for a short summary, my recommendations for Offloading and RSS are:

Server OS version: 2003 SP2
  • RSS/Chimney on by default: Yes
  • Recommended setting: Turn off unless needed
  • Methods to disable: NIC properties or registry
  • Additional recommendations: Update NIC drivers and apply http://support.microsoft.com/kb/950224/en-us

Server OS version: 2008
  • RSS/Chimney on by default: No
  • Recommended setting: Turn on only if needed
  • Methods to disable: NIC properties or Netsh
  • Additional recommendations: Update NIC drivers and apply http://support.microsoft.com/kb/979614/en-us and http://support.microsoft.com/kb/967224/en-us if enabled

Server OS version: 2008 R2
  • RSS/Chimney on by default: Yes (only offloads suitable connections)
  • Recommended setting: Leave enabled
  • Methods to disable: NIC properties or Netsh
  • Additional recommendations: Update NIC drivers and apply http://support.microsoft.com/kb/2958399 and http://support.microsoft.com/kb/2511305

Server OS version: 2012 inc R2
  • RSS/Chimney on by default: Yes (only offloads suitable connections)
  • Recommended setting: Leave enabled
  • Methods to disable: NIC properties or Netsh
  • Additional recommendations: Update NIC drivers and apply http://support.microsoft.com/kb/2885978

Please note, the hotfixes may seem irrelevant, but they are the latest versions (as of 19 August 2014) of the relevant binaries which contain the code for handling RSS and Offloading, and thus contain any fixes to this date which may help performance and functionality in this area.
  • Server 2003: Turn it off unless absolutely needed.
  • Server 2008: Off by default. Turn on if needed, after a NIC driver update and Windows hotfix.
  • Server 2008 R2/2012: On, in Automatic mode, by default. Leave as default and update NIC drivers if possible. Install the hotfix on 2008 R2 in the next change window.

*Update.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2055853 is a recommended fix if you're using RSS and VMware.

http://blogs.technet.com/b/onthewire/archive/2014/01/21/tcp-offloading-chimney-amp-rss-what-is-it-and-should-i-disable-it.aspx
