This report was prompted by a number of write-latency bugs I’ve witnessed in recent versions of ONTAP 7-mode. It’s a modified version of an internal report that I wrote up in some free time, summarised here in case others are seeing similar issues.
Symptoms
- High write latency alarms are being received
- Some write latencies up to 180ms (yes, 180 milliseconds)
- CPU usage is not excessive
Diagnostics
- High write latency is sometimes associated with high CPU usage, but in this case CPU usage is <100% across all cores (priv set diag; sysstat -M)
- IOPS going through the system is not abnormal (the system has handled a similar number of IOPS, or more, in the past)
- Network throughput on the system is not maxed out
- NetApp perfstat analysis (at first level support) comes back as “system is being pushed to its limit, it’s probably time to upgrade” or “everything seems fine”. If you get this response, ask them to check the signature against the bugs below.
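The checklist above can be sketched as a small Python helper. This is only an illustration: the function name, its parameters, and the two default thresholds (20 ms for a latency alarm, 90% for network saturation) are my own assumptions, not values from NetApp or from any ONTAP tool. The numbers would have to be collected by hand, for example from sysstat output.

```python
from typing import Sequence


def matches_signature(
    write_latency_ms: float,
    cpu_per_core_pct: Sequence[float],
    current_iops: float,
    historical_peak_iops: float,
    network_util_pct: float,
    latency_alarm_ms: float = 20.0,        # hypothetical alarm threshold
    network_saturation_pct: float = 90.0,  # hypothetical "maxed out" level
) -> bool:
    """Return True when latency is high but CPU, IOPS and network look normal."""
    high_latency = write_latency_ms >= latency_alarm_ms
    # No core is pegged at 100%.
    cpu_ok = all(pct < 100.0 for pct in cpu_per_core_pct)
    # The system has handled at least this many IOPS before.
    iops_ok = current_iops <= historical_peak_iops
    network_ok = network_util_pct < network_saturation_pct
    return high_latency and cpu_ok and iops_ok and network_ok
```

A True result only means the symptoms look like the ones described here. It does not confirm any particular bug, so the perfstat signature check with NetApp is still needed.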
Known bugs
In my experience, a number of known bugs in ONTAP 7-mode can cause this. I’ve summarised them below.
ONTAP 8.2.2P2
Bug 855574: Sequential appends to user file results in excessive write latency
Details:
- Present in ONTAP 8.2.2P2
- Introduced in 8.2 codebase (8.1 is immune)
- Fixed in ONTAP 8.2.3P3 onwards
Recommendation:
- Upgrade to ONTAP 8.2.4P6
ONTAP 8.2.3P3, possibly earlier versions too
Bug 647449: Use of default quota rules can impact I/O latency and throughput
Details:
- Present in ONTAP 8.2.3P3
- Fixed in ONTAP 8.2.3P4
Recommendation:
- Upgrade to ONTAP 8.2.4P6
ONTAP 8.2.3P3, ONTAP 8.2.3P4, ONTAP 8.2.3P6
Bug 928593: Write operations are not performed resulting in severe write latencies
Details:
- First introduced in ONTAP 8.2.3P3. Remains in subsequent 8.2.3 P-releases
- Mostly fixed in ONTAP 8.2.4. Waiting on P1 or P2 for a more complete fix.
- This bug is directly related to 855574, so if your workloads were “tickling” that bug you may see this one, too.
Recommendation:
- Stay on current version for now. While this bug is fixed in the recently-released 8.2.4, NetApp have asked us to hold off until 8.2.4P1 is released in Q1 2016, as they haven’t ironed out all the write performance bugs.
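As a rough aid, the version ranges listed above can be put into a small Python lookup. This is a sketch under a few assumptions: the names (KnownBug, parse_version, bugs_for) are my own, versions are compared as plain (major, minor, maintenance, P-release) tuples, and only strings such as 8.2.3P3 are parsed. For bug 647449 the first affected release is not known, so only the confirmed 8.2.3P3 is used as the lower bound. Bug 928593 is treated as fixed in 8.2.4, even though that fix is only described as "mostly" complete.

```python
import re
from typing import List, NamedTuple, Tuple

Version = Tuple[int, int, int, int]  # (major, minor, maintenance, P-release)


class KnownBug(NamedTuple):
    bug_id: str
    first_affected: Version  # inclusive
    fixed_in: Version        # exclusive


# Ranges taken from the summaries above; see the caveats in the lead-in.
KNOWN_BUGS = [
    KnownBug("855574", (8, 2, 0, 0), (8, 2, 3, 3)),
    KnownBug("647449", (8, 2, 3, 3), (8, 2, 3, 4)),  # earlier versions may be affected too
    KnownBug("928593", (8, 2, 3, 3), (8, 2, 4, 0)),  # only "mostly fixed" in 8.2.4
]

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:P(\d+))?$")


def parse_version(text: str) -> Version:
    """Parse strings like '8.2', '8.2.4' or '8.2.3P3'; missing parts become 0."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, maintenance, patch = (int(g) if g else 0 for g in match.groups())
    return (major, minor, maintenance, patch)


def bugs_for(version_text: str) -> List[str]:
    """Return the IDs of the bugs above whose version range includes this release."""
    version = parse_version(version_text)
    return [b.bug_id for b in KNOWN_BUGS if b.first_affected <= version < b.fixed_in]
```

For example, bugs_for("8.2.3P6") returns only 928593, while bugs_for("8.2.2P2") returns only 855574. Treat the result as a prompt for a conversation with NetApp support rather than a diagnosis.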
4 replies on “NetApp ONTAP 7-mode: High Write Latency alarms but normal CPU usage, IOPS and network bandwidth”
Hi,
8.2.4P1 has been out since January 14th.
I saw nothing about performance issues in the fixed bugs report (http://mysupport.netapp.com/NOW/download/software/ontap/8.2.4P1/)
Do you have more information? Should we wait until 8.2.4P2?
Thanks for sharing 🙂
Hi Fab, I re-read my correspondence with NetApp and they alluded to the bugs being “internal”. Not sure what that means, but it’s interesting that there are no performance-related fixes in 8.2.4P1 (besides the ones fixed in 8.2.4). I’d recommend you get in touch with NetApp to get some clarification. I’ve also asked, and if I hear back, I will update.
Hi again,
8.2.4P2 has been out since February 18th. Again, nothing about performance issues…
Thanks for the feedback. 🙂
Hi Phil,
Just noticed that 8.2.4P3 has been out since March 31st, fixing several “Append write workloads over CIFS might not perform well” bugs!
I guess these fixes are related to the problems detailed on this post. 🙂
http://mysupport.netapp.com/NOW/download/software/ontap/8.2.4P3/
Hope the bugs are fixed for good…
Cheers