A Deeper Dive - Diagnosing DSE Performance Issues with Ttop and Multidump

So you've read, put into practice, and committed to memory each tool and method in J.B. Langston's Troubleshooting DataStax Enterprise.  You're a master of identifying problem nodes and patterns, pinpointing bottlenecks, and analyzing logs and other resources to sniff out the root cause of the issues you're seeing.

Now, let's say you've found that GC pauses on a specific node are behind your query latency problems.  Or, you see evidence that a compaction thread may be hung.  What now?  Usually from here you would pack up your diagnostic bundle from OpsCenter and contact DataStax support, where we may ask for output from yet more diagnostic tools to analyze the situation.  Here I'll step through a few of the tools that other DataStax support members and I use to go even deeper into DSE performance issues.

Nodetool sjk ttop - The New Kid on the Block, specializing in per-thread CPU and heap allocation

Well, not *exactly* new, but newly adopted in DSE.  The sjk suite and the ttop command itself have been around for some time as a standalone jar, but ttop was added to nodetool in DSE 5.1.  I typically turn to ttop when I suspect high GC churn (a high volume of objects being created and deallocated in the process space), usually indicated by new generation pauses such as this:

INFO  [Service Thread] 2022-08-08 19:47:25,122  GCInspector.java:258 - G1 Young Generation GC in 2320ms.  G1 Eden Space: 11559501111 -> 0; G1 Old Gen: 4929379111 -> 5521173111; G1 Survivor Space: 1845493123 -> 1610612321;
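Before digging further, it can help to confirm how often these long young generation pauses are occurring.  A quick grep of the system log does the trick; note that /var/log/cassandra/system.log is just the common default location and may differ in your installation:

# Find GCInspector pauses of roughly one second or longer (4+ digit millisecond durations)
grep -E "GCInspector.*in [0-9]{4,}ms" /var/log/cassandra/system.log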

So let's say we know a spike in latency occurred at this time, and we know the JVM was paused, causing brief node unresponsiveness.  But what was churning through the heap so quickly that it needed over two seconds to catch its breath?  Enter ttop output:

Example command:

nodetool sjk ttop -o CPU -n 20

Or using sjk jar:

java -jar ./sjk-plus-.jar ttop -s 127.0.0.1:7199 -o CPU -n 20

Output:

2022-08-08 19:47:25.111+0000 Process summary
 process cpu=102.02%
 application cpu=66.73% (user=47.74% sys=18.98%)
 other: cpu=35.29%
 thread count: 229
 heap allocation rate 438mb/s
[5803086] user= 61.18% sys= 0.77% alloc=  126mb/s - CompactionExecutor:1234
[5803087] user= 60.79% sys= 1.08% alloc=  123mb/s - CompactionExecutor:5678
[000496] user= 5.45% sys= 1.54% alloc= 5811kb/s - SharedPool-Worker-1
[000498] user= 5.08% sys= 1.51% alloc= 5691kb/s - SharedPool-Worker-2
[000500] user= 4.18% sys= 1.29% alloc= 4322kb/s - SharedPool-Worker-3
[000501] user= 3.63% sys= 1.05% alloc= 4202kb/s - SharedPool-Worker-4
...
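Since heap churn is what we're chasing here, it's also worth noting that sjk's ordering flag can sort by allocation rate instead of CPU.  The ALLOC ordering below is based on sjk's standalone documentation, so double-check it against the sjk version bundled with your DSE release:

# Rank threads by heap allocation rate rather than CPU
nodetool sjk ttop -o ALLOC -n 20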

In the output above we see a process summary and a thread-by-thread breakdown of CPU and heap allocation.  In this specific sampling, the two threads CompactionExecutor:1234 and CompactionExecutor:5678 are allocating 126mb/s and 123mb/s of heap respectively, head and shoulders above the rest.  From here we've identified a clear set of threads to put under the microscope.  To tell exactly *what* these two threads are allocating at such a rate, leading to our GC trauma, we'll turn to some other tools in our box.  Which brings us to...

Thread dumps via multidump.sh - Old reliable, specializing in CPU utilization and thread profiling

Keeping our current example going, we want to know more about the two aforementioned compaction threads, so we take a thread dump as described in JB's blog, around the same time as our ttop output.  The output from this should be a top.out file containing typical per-thread top output, and jstack output containing the Java thread dump.  If not, troubleshooting steps can be found in the comments of that article.
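If you don't have multidump.sh handy, you can approximate what it collects with standard tools.  This is a rough sketch rather than the script itself, assuming a single DSE JVM and writing to the current directory:

# Per-thread CPU snapshot in batch mode (-H shows individual threads)
top -H -b -n 1 -p <cassandra_pid> > top.out

# Full Java thread dump, including lock information
jstack -l <cassandra_pid> > jstack.out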

Typical top.out output will look like this, repeated over several iterations:

 PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
17290 cassand+  24   4  3.031t 0.202t 0.068t R 88.2 82.2 195:50.33 java
10456 cassand+  24   4  3.031t 0.202t 0.068t R 76.5 82.2 126:13.95 java
17291 cassand+  24   4  3.031t 0.202t 0.068t R 12.6 82.2 171:39.46 java
..

You can search the accompanying jstack output for the two compaction executor threads across iterations and find entries like so:

"CompactionExecutor:1234" #2183 daemon prio=1 os_prio=4 tid=0x00007f06267e5180 nid=0x438A runnable [0x00007c6736253000]
  java.lang.Thread.State: RUNNABLE
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:62)
at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:36)
at org.apache.cassandra.db.RangeTombstone$Tracker.isDeleted(RangeTombstone.java:244)
at org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced(LazilyCompactedRow.java:279)
...

Note, the PID (thread ID) from top.out can be related directly to jstack.out as well, via the "nid" parameter.  PID 17290 in this case, converted to hex, is 438A, which you can see as "nid=0x438A" above.
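A quick way to do that conversion and jump straight to the matching thread in the dump (grepping case-insensitively, since jstack typically prints the nid in lowercase hex):

# Convert the thread PID from top.out to the hex nid used by jstack
printf 'nid=0x%x\n' 17290        # -> nid=0x438a

# Pull the matching thread header and the top of its stack out of the dump
grep -i -A 15 'nid=0x438a' jstack.out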

So in this scenario, CompactionExecutor:1234 was churning over a huge number of range tombstones, which led us to the root of the issue.  Of course I'm skipping a few steps between the above output and the exact root cause, but the important thing here is the ability to map symptoms to ttop and multidump/jstack output.  I won't rehash JB's article here, but you can also find those thread IDs in the logs for another data point.  Note, for the example of a hung thread I mentioned earlier, you could simply find the last entry that thread wrote to the logs, then find its output in jstack.out to see exactly where in the code it's stuck.
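For that log cross-check, a simple search by thread name is usually enough.  As before, the log path below is just the common default:

# Find the most recent log entries written by the suspect thread
grep "CompactionExecutor:1234" /var/log/cassandra/system.log | tail -20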

I do want to mention two more tools here, but detailed examples are better reserved for their own articles.  As discussed, ttop and multidump team up very well to diagnose issues like high CPU, young gen GC pauses, and query latency.  For high memory usage, however, I'd still recommend a heap dump loaded into Eclipse Memory Analyzer.  And for a heavier but even deeper real-time analysis of the Cassandra process, beyond what ttop and thread dumps can provide for thread activity, memory consumption, and general JVM activity, a Java Flight Recorder recording is your best bet.  But again, these warrant their own articles, so stay tuned!