JVM Settings

DSE Version: 6.0


Apache Cassandra is a Java application, hence it runs on the Java Virtual Machine, or JVM. Therefore, it is important to know how to configure the JVM. You will learn about that in this unit.


Any application you execute runs on top of layers of architecture. Apache Cassandra is a Java application, hence it runs on the Java Virtual Machine, affectionately known as the beloved JVM. The JVM runs on top of your OS, which in turn runs on top of your hardware.

So how many layers does this give us overall? Well, four, and each one can be tuned. There's the application, which is Apache Cassandra. Most configuration you'll do for Cassandra is in the cassandra.yaml file. You tune the JVM in the jvm.options file. Depending on your operating system, you may have several tuning possibilities there as well. And last, but probably most fun, is upgrading your hardware.

Most likely the first tuning parameter you'll work with deals with how much memory you give Apache Cassandra. The JVM divides memory into logical sections: one for code, one for the stack, and, most notably, the transient area we know as the heap. The heap is where most of the action happens. As Cassandra serves requests, compacts data, and pretty much does everything it does, it mostly utilizes the heap. As Apache Cassandra stops using objects it has allocated, the garbage collector reclaims their space for future use.

The jvm.options file is where you tune how your virtual machine behaves. It's basically a set of default command-line options that are applied when DataStax Enterprise starts. When you start DSE, the cassandra-env.sh file includes the options for you. Don't tweak these values if you are unfamiliar with what they do. The default values are a safe place to start if you are unsure of what values you should use.
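To make the idea of "a set of default command-line options" concrete, here is a minimal sketch of how an options file could be turned into JVM arguments. The sample content is hypothetical: the flags are standard HotSpot options, but the values are only illustrative. The parsing rule (keep lines that start with a dash, skip comments and blank lines) is an assumption for illustration, not a description of the actual DSE startup code.

```python
# Hypothetical jvm.options content. The flags are standard HotSpot options;
# the values are illustrative, not recommendations.
SAMPLE_JVM_OPTIONS = """\
# Heap size
-Xms8G
-Xmx8G

# Garbage collector
-XX:+UseG1GC
#-XX:MaxGCPauseMillis=500
"""


def collect_jvm_options(text):
    """Return the lines that would be handed to the JVM as options.

    Assumption: only lines whose first non-blank character is '-' count.
    Comment lines (starting with '#') and blank lines are ignored.
    """
    options = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("-"):
            options.append(stripped)
    return options


if __name__ == "__main__":
    # Show the kind of command line these options would end up on.
    print("java " + " ".join(collect_jvm_options(SAMPLE_JVM_OPTIONS)) + " ...")
```

Note that the commented-out pause-time option is skipped, which is why commenting a line out is a convenient way to fall back to a default.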

MAX_HEAP_SIZE determines how large your JVM heap will be. 8 GB is a good baseline. Be careful not to set this value too high, because the OS still needs room to operate as well. Another disadvantage of large heaps is garbage collection pauses. The larger the heap, the less often the garbage collector has to kick in. But when it does kick in, it has more memory to work through, and the pause can last much longer, depending on the type of garbage collector you are using.
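If you leave the maximum heap size unset, the startup script picks one for you. The sketch below follows the rule used in the open-source Apache Cassandra cassandra-env.sh script: take the larger of "half of RAM, capped at 1 GB" and "a quarter of RAM, capped at 8 GB". Treat it as an assumption that DSE 6.0 behaves the same way; the function name is made up for this example.

```python
def default_max_heap_mb(system_memory_mb):
    """Estimate the default maximum heap size in megabytes.

    Assumption: mirrors max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)),
    the rule in the open-source cassandra-env.sh script.
    """
    half_capped = min(system_memory_mb // 2, 1024)
    quarter_capped = min(system_memory_mb // 4, 8192)
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for ram_gb in (2, 16, 64):
        heap_mb = default_max_heap_mb(ram_gb * 1024)
        print(f"{ram_gb} GB RAM -> {heap_mb} MB heap")
```

The 8 GB cap is the reason the heap stops growing on large machines: the remaining memory is left to the OS, which fits the warning above about not crowding it out.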

HEAP_NEWSIZE determines how much of the heap is set aside for newly created objects. We'll get into this more soon, but for now, think of this area as the home of the new generation of objects, that is, the young and hip crowd. Don't worry, new objects are usually collected sooner rather than later because they are generally temporary instances. By default, you get 100 megabytes per core for this new area.
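The "100 megabytes per core" default is easy to work out by hand, but a small sketch makes the arithmetic explicit. One detail here goes beyond what was said above and is labeled as an assumption: in the open-source Apache Cassandra startup script, the result is also capped at a quarter of the total heap. The function and parameter names are hypothetical.

```python
def default_new_gen_mb(cpu_cores, max_heap_mb, per_core_mb=100):
    """Estimate the default new-generation size in megabytes.

    100 MB per core comes from the text above. The cap at one quarter of
    the heap is an assumption based on the open-source cassandra-env.sh.
    """
    return min(per_core_mb * cpu_cores, max_heap_mb // 4)


if __name__ == "__main__":
    # 8 cores with an 8 GB heap: 8 * 100 MB, well under the quarter-heap cap.
    print(default_new_gen_mb(cpu_cores=8, max_heap_mb=8192))
    # 32 cores with the same heap: the quarter-heap cap wins.
    print(default_new_gen_mb(cpu_cores=32, max_heap_mb=8192))
```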

Again, don't make your heap too large. The OS needs room to operate, and large heaps lead to large GC pauses. 8 gigabytes is a good starting point.

Java 9 changed the JVM's default garbage collector to G1, which stands for Garbage First. Before that, Cassandra typically ran with Concurrent Mark Sweep, or CMS for short. The main difference is that G1 divides the heap into many smaller regions and collects first the regions that are likely to contain the most garbage. We'll get into this more soon.
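The "garbage first" idea can be illustrated with a toy model. This is only a sketch of the selection principle, not how the real collector is implemented: the actual G1 estimates liveness per region and chooses regions to fit a pause-time goal. The region names and garbage fractions below are invented for the example.

```python
def garbage_first(garbage_by_region, budget):
    """Pick up to `budget` regions to collect, most garbage first.

    garbage_by_region maps a region name to the fraction of that region
    estimated to be garbage (0.0 to 1.0). Toy model only.
    """
    ranked = sorted(
        garbage_by_region.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return [region for region, _ in ranked[:budget]]


if __name__ == "__main__":
    # Hypothetical heap split into four regions.
    heap = {"r0": 0.10, "r1": 0.85, "r2": 0.40, "r3": 0.95}
    # With time to collect only two regions, take the two fullest of garbage.
    print(garbage_first(heap, budget=2))
```

Collecting a few garbage-heavy regions at a time is what lets this style of collector keep individual pauses short even on a large heap.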

JMX, short for Java Management Extensions, is a Java technology that allows you to communicate with and manage your Java application. Think of it as a way to hook into a running Java application and tweak some runtime values.

Here are a few of the settings you can tune with JMX. Take a moment and pause the video to look them over. Perhaps go and get a snack while you are at it.
