Tuesday, June 10, 2014

Slow NFS due to FIXEDMTU After Live Upgrade

Over the last few days we were facing a performance issue with an NFS share after a live upgrade. We later noticed it was due to the FIXEDMTU flag on the interface.
FIXEDMTU was not caused by Live Upgrade itself. It was caused by the mtu parameter in the hostname.interface file, which took effect on reboot.


There is no reason to set an MTU value of 1500; the default MTU for Ethernet is already 1500.

example:
./hostname.e1000g3
auksvaqs-prod mtu 1500

./hostname.e1000g2
auksvaqs-nas mtu 1500
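
With the mtu option present in those files, the interface comes up with the FIXEDMTU flag set. A minimal sketch of checking and cleaning this up (the interface and host names are just the ones from the example above, and the files are assumed to live in /etc as usual):

# FIXEDMTU shows up in the interface flags when an explicit mtu was configured,
# e.g. flags=...<UP,BROADCAST,RUNNING,MULTICAST,FIXEDMTU,IPv4> mtu 1500
ifconfig e1000g3

# Drop the redundant "mtu 1500" token so each file only contains the hostname;
# the change takes effect on the next reboot
echo "auksvaqs-prod" > /etc/hostname.e1000g3
echo "auksvaqs-nas"  > /etc/hostname.e1000g2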


Please also make sure LSO stays disabled across reboots.

A side note regarding LSO, which can also cause performance degradation over the network:
Large Send Offload and Network Performance
One issue that I continually see reported by customers is slow network performance.  Although there are literally a ton of issues that can affect how fast data moves to and from a server, there is one fix I’ve found that will resolve this 99% of the time — disable Large Send Offload on the Ethernet adapter.
So what is Large Send Offload (also known as Large Segmentation Offload, and LSO for short)?  It’s a feature on modern Ethernet adapters that allows the TCP/IP network stack to build a large TCP message of up to 64KB in length before sending it to the Ethernet adapter.  Then the hardware on the Ethernet adapter — what I’ll call the LSO engine — segments it into smaller data packets (known as “frames” in Ethernet terminology) that can be sent over the wire. This is up to 1500 bytes for standard Ethernet frames and up to 9000 bytes for jumbo Ethernet frames.  In return, this frees up the server CPU from having to handle segmenting large TCP messages into smaller packets that will fit inside the supported frame size.  Which means better overall server performance.  Sounds like a good deal.  What could possibly go wrong?
Quite a lot, as it turns out.  In order for this to work, the other network devices — the Ethernet switches through which all traffic flows — all have to agree on the frame size.  The server cannot send frames that are larger than the Maximum Transmission Unit (MTU) supported by the switches.  And this is where everything can, and often does, fall apart.
The server can discover the MTU by asking the switch for the frame size, but there is no way for the server to pass this along to the Ethernet adapter.  The LSO engine doesn’t have the ability to use a dynamic frame size.  It simply uses the default standard value of 1500 bytes, or if jumbo frames are enabled, the size of the jumbo frame configured for the adapter.  (Because the maximum size of a jumbo frame can vary between different switches, most adapters allow you to set or select a value.)  So what happens if the LSO engine sends a frame larger than the switch supports?  The switch silently drops the frame.  And this is where a performance enhancement feature becomes a performance degradation nightmare.
1.    With LSO enabled, the TCP/IP network stack on the server builds a large TCP message.
2.    The server sends the large TCP message to the Ethernet adapter to be segmented by its LSO engine for the network.  Because the LSO engine cannot discover the MTU supported by the switch, it uses a standard default value.
3.    The LSO engine sends each of the frame segments that make up the large TCP message to the switch.
4.    The switch receives the frame segments, but because LSO sent frames larger than the MTU, they are silently discarded.
5.    On the server that is waiting to receive the TCP message, the timeout clock reaches zero when no data is received, and it sends back a request to retransmit the data.  Although the timeout is very short in human terms, it is rather long in computer terms.
6.    The sending server receives the retransmission request and rebuilds the TCP message.  But because this is a retransmission request, the server does not send the TCP message to the Ethernet adapter to be segmented.  Instead, it handles the segmentation process itself.  This appears to be designed to overcome failures caused by the offloading hardware on the adapter.
7.    The switch receives the retransmission frames from the server, which are the proper size because the server is able to discover the MTU, and forwards them on to the router.
8.    The other server finally receives the TCP message intact.
This can basically be summed up as offload data, segment data, discard data, wait for timeout, request retransmission, segment retransmission data, resend data.  The big delay is waiting for the timeout clock on the receiving server to reach zero.  And the whole process is repeated the very next time a large TCP message is sent.  So is it any wonder that this can cause severe network performance issues? This is by no means an issue that affects only Peer 1.  Google is littered with articles by major vendors of both hardware and software telling their customers to turn off Large Send Offload.  Nor is it specific to one operating system.  It affects both Linux and Windows.
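
On Linux, for example, LSO is usually exposed as TCP segmentation offload and can be checked and turned off with ethtool. A rough sketch (the interface name eth0 is just a placeholder):

# show the current offload settings for the interface
ethtool -k eth0 | grep segmentation
# turn off TCP segmentation offload (and, if desired, generic segmentation offload)
ethtool -K eth0 tso off
ethtool -K eth0 gso off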
____________________________________________________________________





The issue with slow copies over NFS was solved as follows.

I set the following in /kernel/drv/e1000g.conf...

tx_hcksum_enable=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;
    # this parameter disables hardware checksum creation
lso_enable=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0;
    # this parameter disables the LSO feature in the driver

...and rebooted the server


With this, the LSO feature is completely disabled in the e1000g driver.
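
Each comma-separated value in e1000g.conf applies to one driver instance (the first value to e1000g0, the second to e1000g1, and so on). So, as a sketch, if you only wanted to turn off LSO on the two interfaces from the example above (e1000g2 and e1000g3) while leaving it enabled elsewhere, something like this should do it:

lso_enable=1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1;
    # the third and fourth values correspond to instances e1000g2 and e1000g3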

We had already disabled this feature via an ndd parameter, but obviously this was not enough and the driver still affected the transfer!
