I wrote this report after witnessing a number of write latency bugs in recent versions of ONTAP 7-mode. It’s a modified version of an internal report that I wrote up in some free time, and I’m summarising my findings here in case others are seeing similar issues.
Symptoms
- High write latency alarms being received
- Some write latencies of up to 180 ms (yes, 180 milliseconds)
- CPU usage is not excessive
Diagnostics
Known bugs
In my experience, a number of known bugs in ONTAP 7-mode can cause these symptoms. I’ve summarised them below, grouped by affected version.
ONTAP 8.2.2P2
Bug 855574: Sequential appends to user file results in excessive write latency
Details:
- Present in ONTAP 8.2.2P2
- Introduced in 8.2 codebase (8.1 is immune)
- Fixed in ONTAP 8.2.3P3 onwards
Recommendation:
- Upgrade to ONTAP 8.2.4P6
ONTAP 8.2.3P3, possibly earlier versions too
Bug 647449: Use of default quota rules can impact I/O latency and throughput
Details:
- Present in ONTAP 8.2.3P3
- Fixed in ONTAP 8.2.3P4
Recommendation:
- Upgrade to ONTAP 8.2.4P6
ONTAP 8.2.3P3, ONTAP 8.2.3P4, ONTAP 8.2.3P6
Bug 928593: Write operations are not performed resulting in severe write latencies
Details:
- Introduced in ONTAP 8.2.3P3 and still present in subsequent 8.2.3 P-releases
- Mostly fixed in ONTAP 8.2.4. Waiting on P1 or P2 for a more complete fix.
- This bug is closely related to bug 855574, so if your workloads were “tickling” that bug you may see this one, too.
Recommendation:
- Stay on your current version for now. Although this bug is mostly fixed in the recently released 8.2.4, NetApp have asked us to hold off until 8.2.4P1 is released in Q1 2016, as not all of the write performance bugs have been ironed out yet.
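If you have a number of filers on different releases, the version ranges above can be turned into a quick lookup. The following is only a rough sketch: the table is transcribed from this post, the function names are my own, and it assumes that P-releases sort numerically within a release and that a release with no P suffix sorts before its P1. It doesn’t handle D-releases or release candidates.

```python
import re

# Bug data transcribed from this post: (bug ID, first affected, first fixed).
# These ranges are a summary of the notes above, not an official NetApp list.
KNOWN_BUGS = [
    (855574, "8.2", "8.2.3P3"),
    (647449, "8.2.3P3", "8.2.3P4"),  # may also affect earlier releases
    (928593, "8.2.3P3", "8.2.4"),    # only "mostly" fixed in 8.2.4
]

# Matches strings such as "8.2", "8.2.4" or "8.2.3P3".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:P(\d+))?$")


def parse_version(text):
    """Turn '8.2.3P3' into (8, 2, 3, 3); missing parts count as 0."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognised ONTAP version: %r" % text)
    return tuple(int(part) if part else 0 for part in match.groups())


def bugs_affecting(version):
    """Return the bug IDs from KNOWN_BUGS whose range covers this version."""
    v = parse_version(version)
    return [
        bug
        for bug, first, fixed in KNOWN_BUGS
        if parse_version(first) <= v < parse_version(fixed)
    ]


if __name__ == "__main__":
    for release in ("8.1", "8.2.2P2", "8.2.3P3", "8.2.3P6", "8.2.4P6"):
        print(release, bugs_affecting(release))
```

Bear in mind that an empty result only means none of the three bugs listed here apply. It says nothing about other bugs in that release, and 8.2.4 itself is reported clean by this table even though the fix for 928593 there is incomplete.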