TCP Offloading/Chimney & RSS…What is it and should I disable it?
Having decided to start this blog to convey my experience with network analysis and troubleshooting, one subject instantly sprang to mind for my first post: TCP Chimney/Offloading.
I get asked about this so often that I keep a ready-made email of advice on what it is and what I recommend to customers about using it (note that this is my recommendation, not Microsoft's, although I think our official recommendation would likely mirror mine). My advice is based on thousands of customer cases I've handled over the years involving this feature. I've therefore compiled what's hopefully a one-stop shop for all your TCP offloading needs. Apologies in advance if it's a bit wordy, but I've tried to convey everything I can about the subject for you.
So why do I get asked about it all the time? Well, let's start with what it is and what it does.
What is TCP Offloading/Chimney?
Starting when Windows Server 2003 SP1 was the current server OS, Microsoft released the Scalable Networking Pack (http://support.microsoft.com/kb/912222/en-us).
This turned on three distinct features in the OS:
1.) Receive Side Scaling (RSS)
Without RSS, where multiple CPUs reside in a single computer, the Windows networking stack limits "receive" protocol processing to a single CPU. This old design of dealing with all incoming network traffic on a single processor core was starting to cause a bottleneck on newer multiprocessor systems. RSS resolves this issue by enabling the packets that are received from a network adapter to be balanced across multiple CPUs. With RSS on, each incoming TCP connection is load balanced over the available cores, spreading the load and preventing a bottleneck from occurring. This is becoming a necessity as servers have to handle increasingly high loads of network traffic.
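To make the idea concrete, here is a minimal Python sketch of how hashing a connection's addresses and ports can pin each TCP connection to one core while spreading many connections across all of them. It is an illustration only: real RSS is performed by the NIC using a Toeplitz hash and an indirection table, whereas the CRC32 hash, the pick_cpu helper and the sample addresses below are assumptions made up for this example.

```python
import zlib
from collections import Counter

def pick_cpu(src_ip, src_port, dst_ip, dst_port, cpu_count):
    """Map a TCP connection's 4-tuple to a CPU index (hypothetical helper)."""
    # CRC32 is only a stand-in for the hash a real NIC would compute.
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % cpu_count

# Made-up connections from many clients to a single server port.
connections = [("10.0.0.%d" % (i % 250 + 1), 40000 + i, "192.168.1.103", 445)
               for i in range(1000)]

# Count how many connections land on each of four cores.
load = Counter(pick_cpu(*conn, cpu_count=4) for conn in connections)
print(sorted(load.items()))
```

Because the hash depends only on the connection's addresses and ports, every packet of a given connection is handled by the same core, while different connections are shared out between cores.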
2.) TCP Chimney (sometimes referred to as TCP Offloading)
This feature is designed to move network processing tasks, such as packet segmentation and reassembly, from the computer's CPU to a network adapter that supports TCP Chimney Offload. This reduces the workload on the host CPU by moving it to the NIC, allowing the host OS to perform quicker and also speeding up the processing of network traffic.
3.) Network Direct Memory Access (NetDMA)
http://technet.microsoft.com/en-us/library/gg162716(WS.10).aspx
The NetDMA interface provides generic access to direct memory access (DMA) engines that can perform memory-to-memory data transfers with little CPU involvement. Again, this is designed to take work away from the CPU by allowing the NIC to move data from receive buffers without using the CPU as much.
Why would I want to disable it then?
All these features sound brilliant, and they were only enabled by installing the Scalable Networking Pack, so why would you want to disable them?
Well, with the release of Service Pack 2 for Windows Server 2003, Microsoft decided to include the Scalable Networking Pack and thus turn the features on. If a server has a NIC which supports these features, and they're enabled in the NIC properties (more on this later), then we'll use them. The problem with this approach was that many NICs reported to the OS that they supported these features, which they indeed did, but in reality many didn't perform these functions very well at all. With 2003 this was an all-or-nothing affair: if offloading was turned on, we offloaded all TCP connections on a supported NIC, regardless of whether they would benefit or not.
Over the course of the next year or so, Microsoft support saw a great number of issues caused by drivers misbehaving with these features, which had a knock-on effect on network traffic. This in turn caused a wide range of weird and wonderful symptoms seen across the board, from the Exchange team to SQL to BizTalk to IIS to ISA. Most of these landed in the lap of my colleagues and me in the Networking support team, and as such we've probably seen issues numbering in the thousands caused by these features, where turning them off resolved the problem. So much so that turning off Offloading/RSS became an almost standard troubleshooting step with 2003 cases.
Most NIC vendors have released numerous updates over the years to resolve these issues, as has Microsoft to improve the feature in the OS. However, my general advice with Server 2003 is to disable these features.
In essence, with Server 2003 this feature tends to cause more problems than it solves, so I tend to recommend it is turned off. It doesn't really become effective until you get high-speed (10 Gbps), low-latency networks and, as mentioned, many older NIC drivers don't implement it well, so we get a lot of issues. If it is enabled, the bare minimum recommendation would be that the NIC and teaming drivers and firmware are the latest available and http://support.microsoft.com/kb/950224/en-us is applied.
How do I turn off these features in Server 2003?
If the features aren't needed then they should be turned off. We can do this in one of two ways.
The safest way is to do this in the registry by using method 3 in http://support.microsoft.com/kb/948496/en-us, which sets the relevant registry values to 0 (off). In Registry Editor, locate the following key and then work through the steps below:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
- Right-click EnableTCPChimney, and then click Modify.
- In the Value data box, type 0, and then click OK.
- Right-click EnableRSS, and then click Modify.
- In the Value data box, type 0, and then click OK.
- Right-click EnableTCPA, and then click Modify.
- In the Value data box, type 0, and then click OK.
- Exit Registry Editor, and then restart the computer.
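If you need to make the same change on many servers, the manual steps above could be captured in a .reg file instead. The following is only a rough Python sketch that builds the text of such a file; the build_reg_file function name is my own invention for illustration, the key and value names are the ones listed above, and you would still need to save the output, import it and restart the server.

```python
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUES = ("EnableTCPChimney", "EnableRSS", "EnableTCPA")

def build_reg_file(key=KEY, names=VALUES, data=0):
    """Return .reg file text that sets each named DWORD value to `data`."""
    # Standard .reg layout: header line, blank line, [key], then one line per value.
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    lines += [f'"{name}"=dword:{data:08x}' for name in names]
    return "\r\n".join(lines) + "\r\n"

print(build_reg_file())
```

Treat this as a starting point to test on a non-production server first, not as a replacement for the KB article.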
Alternatively you can disable the features in the NIC properties of most NICs, however the naming convention and exposure of these settings varies from NIC to NIC and also from driver to driver.
If the feature is disabled in either the NIC properties or the registry, it's off, regardless of whether the other is on. This is why I recommend you use the registry, as it won't be affected by driver updates etc. and is much easier to control centrally.
So what about newer OS versions?
The first point of note is that Microsoft made a lot of effort in the newer OS versions to ensure the drivers were up to the job, and also made a lot of improvements to the implementation of the features and the TCP stack in the OS. This makes enabling the features much safer in post-2003 OS versions.
Server 2008
In Windows Server 2008 (not R2) Offloading is turned off by default anyway. http://support.microsoft.com/kb/951037/en-us
You can enable it using the NIC properties or by using Netsh, which is outlined in the link above. The offloading capabilities are more granular in 2008 than they were in 2003: we only offload suitable network connections.
As before, if you're using this feature it's always wise to install the latest NIC drivers and firmware to ensure the NIC manufacturer's latest updates are in place.
Ensure you have http://support.microsoft.com/kb/976035/en-us installed on top of SP2 to prevent an unexpected restart scenario.
If the above steps are done, in my experience it's very safe to turn the feature on in 2008 if you feel it is needed.
To check the state of offloading, run netstat -t in a command prompt and you'll get output like the following:
Active Connections
Proto Local Address Foreign Address State Offload State
TCP 127.0.0.1:52613 computer_name:52614 ESTABLISHED InHost
TCP 192.168.1.103:52614 computer_name:52613 ESTABLISHED Offloaded
InHost shows the connection is not offloaded and is thus handled by the OS; Offloaded means exactly that.
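On a busy server that output can run to hundreds of lines, so a quick tally can help. Below is a minimal Python sketch that counts connections by offload state; it assumes the English-language column layout shown above, the sample text is simply the example output from this post, and count_offload_states is a name I've made up for illustration.

```python
from collections import Counter

SAMPLE = """\
Active Connections
Proto Local Address Foreign Address State Offload State
TCP 127.0.0.1:52613 computer_name:52614 ESTABLISHED InHost
TCP 192.168.1.103:52614 computer_name:52613 ESTABLISHED Offloaded
"""

def count_offload_states(netstat_output):
    """Count TCP connections by the value in the last column (offload state)."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Connection rows start with "TCP" and have exactly five fields.
        if len(fields) == 5 and fields[0] == "TCP":
            counts[fields[-1]] += 1
    return counts

print(dict(count_offload_states(SAMPLE)))
```

For the sample above this reports one InHost connection and one Offloaded connection.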
For those of you looking in memory dumps and wondering if these features are in use, you should be able to dump the registry keys used to set this by running x tcpip!*disable* in windbg for Server 2003.
In this example, both RSS and TCP Chimney are disabled.
x tcpip!*disable*
b8f1a0d4 tcpip!DisableRSS = 1
b8f1a360 tcpip!DisableUserTOSSetting = 1
b8f1df34 tcpip!DisableMediaSenseEventLog = 0
b8f1a0d0 tcpip!DisableTCPChimney = 1
b8f1ae54 tcpip!DisableTaskOffload = 0
b8f1cdc0 tcpip!DisableLargeSendOffload = 0
b8f1a0b4 tcpip!DisableIPSourceRouting = 2
b8f1ae4c tcpip!DisableMediaSense = 0
b8f1a0ec tcpip!DisableUserTOS = 1
b8f01ff3 tcpip!DisableRouter (void)
b8f0c3b0 tcpip!IPDisableMediaSenseRequest (struct _IRP *, struct _IO_STACK_LOCATION *)
b8f106d6 tcpip!OlmDisableOffloadOnInterface (unsigned int)
b8f04d4b tcpip!IPDisableChimneyOffload (struct _IRP *, struct _IO_STACK_LOCATION *)
b8f048bf tcpip!IPDisableSniffer (struct _UNICODE_STRING *)
As these structures don't exist in 2008 and later, you'll need to use another command. I'm currently trying to confirm what the best method is, and I'll update the blog with the info.
Server 2008 R2
With Server 2008 R2 this feature is much more intelligent: it'll only offload when the conditions are right, as per http://technet.microsoft.com/en-us/library/gg162709(WS.10).aspx:
Automatic. In automatic mode, TCP Chimney Offload considers offloading the processing for a connection only if the following criteria are met: the connection is established through a 10 Gbps Ethernet adapter, the mean round trip link latency is less than 20 milliseconds, and at least 130 KB of data has been exchanged over the connection. In automatic mode, the TCP receive window is set to 16 MB. Because the Windows stack has performance optimizations not found in Chimney-capable network adapters, automatic mode restricts offloads only to those connections that might receive the most benefit from it.
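The three criteria in that description can be written down as a simple check. This is just a Python sketch of the rule as quoted, not how the Windows stack actually implements it; the function and parameter names are hypothetical, and I'm assuming "130 KB" means 130 x 1024 bytes and that any link of 10 Gbps or faster qualifies.

```python
def chimney_auto_eligible(link_gbps, mean_rtt_ms, bytes_exchanged):
    """Apply the three automatic-mode criteria quoted above (illustrative only)."""
    return (link_gbps >= 10                      # 10 Gbps Ethernet adapter
            and mean_rtt_ms < 20                 # mean round-trip latency under 20 ms
            and bytes_exchanged >= 130 * 1024)   # at least 130 KB exchanged so far

# A long-lived transfer on a fast, low-latency link qualifies...
print(chimney_auto_eligible(10, 5, 1_000_000))
# ...but the same transfer on a 1 Gbps link, or over a 25 ms path, does not.
print(chimney_auto_eligible(1, 5, 1_000_000))
print(chimney_auto_eligible(10, 25, 1_000_000))
```

All three conditions must hold at once, which is why short-lived or high-latency connections stay in the host stack.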
Automatic mode is the default setting and I'd advise it's left as default. As always, ensure the latest NIC drivers and firmware are installed to remove the risk of any known issues, but in my experience, taken from many thousands of customers, this feature is a real benefit to the OS. In fact I've seen multiple customers who have got into the habit of disabling these features in their OS build following the issues they experienced with Server 2003. I've been called out to look at performance issues, and when we've re-enabled the features we've noticed a massive performance improvement.
If you are getting problems which are
resolved by turning off TOE in 2008 R2, my first step would be to update
the NIC driver and firmware as there are almost always updates for the
NICs which resolve the majority of offloading issues I encounter.
If the problem persists, turning off Offloading is the wrong thing to do. Raise a case with Microsoft and we'll help you get to the bottom of it. By having a policy of disabling these features, you are effectively restricting your Windows platform's network performance for the sake of one or two issues which could be investigated and resolved.
The performance improvement on certain connections is enormous and shouldn't be thrown away due to habit (i.e. the 2003 behaviour) or a few issues which haven't been fully investigated. In my personal experience, quick fixes will eventually come round and bite you.
To manage the settings in 2008 R2, the following TechNet article gives more information on the Netsh commands available.
http://technet.microsoft.com/en-us/library/gg162682(v=ws.10).aspx
It's also advised to install http://support.microsoft.com/kb/2477730/en-us to resolve an issue with offloading in 2008 R2. This is non-urgent, so it could be planned into your next change window.
Server 2012
Offloading in Windows Server 2012 works much as it does in Server 2008 R2, so the same advice applies. RSS, however, becomes more important in this OS because SMB Multichannel relies on it.
http://support.microsoft.com/kb/2846837/en-us is a recommended hotfix for RSS in server 2012.
Additional points of note around offloading:
Large send offload and checksum offload
I've seen many references on the internet suggesting that TCP task offloading features, such as checksum offload and Large Send Offload, are related to TCP Chimney. It's important to note that these are not related to the TCP Chimney/offload described above. Checksum offload is where we allow the NIC to set the checksum on a packet when it leaves the machine (which is why Netmon and Wireshark often show "incorrect checksum" on packets, as the driver which captures them sits above the NIC where the checksum is set). Large Send Offload (LSO) allows the application layer to pass down a packet which would be too big for transmission and lets the NIC chop it up into transmittable sizes (which is why you can see packets with more than 1460 bytes of payload in Netmon/Wireshark).
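To picture what the NIC does with a large send, here is a small Python sketch that chops one oversized buffer into 1460-byte pieces. It only models the splitting of the payload: a real NIC also builds the TCP/IP headers for every segment, and the segment function, the 64 KB buffer size and the fixed 1460-byte MSS (the usual figure for a 1500-byte Ethernet MTU) are assumptions for the sake of the example.

```python
MSS = 1460  # typical TCP payload per packet on a 1500-byte Ethernet MTU

def segment(payload, mss=MSS):
    """Split one large send into MSS-sized chunks, as the NIC would with LSO on."""
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]

big_send = bytes(64 * 1024)   # one 64 KB buffer handed down by the stack
chunks = segment(big_send)
print(len(big_send), "bytes ->", len(chunks), "segments, last one",
      len(chunks[-1]), "bytes")
```

A capture taken above the NIC sees the single 64 KB "packet", whereas the wire carries the 45 smaller segments.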
Checksum offload and LSO can both be set in the NIC properties and are generally very safe to leave on. You may want to disable LSO if you're sniffing traffic, as otherwise you won't be seeing the packets as they are transmitted on the wire.
Network tracing offloaded connections:
Another reason you may want to disable TCP offloading is if you want to take a network trace. Both Netmon's filter driver and Wireshark's will show you only the three-way handshake and the session teardown if offloading is being used. This is due to where the drivers sit: when offloading is used, the data bypasses these drivers, so you'll only see the parts of the session the OS is responsible for, namely the session setup and teardown.
Microsoft Message Analyzer (http://blogs.technet.com/b/messageanalyzer/archive/2012/09/17/meet-the-successor-to-microsoft-network-monitor.aspx) is a new tool from Microsoft which allows us to trace at layers other than NDIS (where Netmon sits) and thus may allow you to work round this issue, depending on the scenario.
So for a short summary my recommendations for Offloading and RSS are:
Server OS version | RSS/Chimney On by default | Recommended setting | Methods to disable | Additional Recommendations |
2003 SP2 | Yes | Turn off unless needed | NIC properties or registry | Update NIC drivers and apply http://support.microsoft.com/kb/950224/en-us |
2008 | No | Turn on only if needed | NIC properties or Netsh | Update NIC drivers and apply http://support.microsoft.com/kb/979614/en-us http://support.microsoft.com/kb/967224/en-us if enabled |
2008 R2 | Yes (Only offloads suitable connections) | Leave enabled | NIC properties or Netsh | Update NIC drivers and apply http://support.microsoft.com/kb/2958399 http://support.microsoft.com/kb/2511305 |
2012 inc R2 | Yes (Only offloads suitable connections) | Leave enabled | NIC properties or Netsh | Update NIC drivers and apply http://support.microsoft.com/kb/2885978 |
- Server 2003: Turn it off unless absolutely needed
- Server 2008: Off by default, Turn on if needed after a NIC driver update and Windows hotfix.
- Server 2008 R2/2012: On, Automatic mode by default, leave as default and update NIC drivers if possible. Install hotfix on 2008 R2 in next change window.
*Update.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2055853 is a recommended fix if you're using RSS and VMware.
http://blogs.technet.com/b/onthewire/archive/2014/01/21/tcp-offloading-chimney-amp-rss-what-is-it-and-should-i-disable-it.aspx