This page describes JVM initialisation problems on our shared systems, which are caused by the way Java decides how many parallel garbage collection threads to spawn at runtime.

The problem - parallel garbage collection in Java

When the Java Virtual Machine (JVM) starts, by default it spawns a number of garbage collection (GC) threads, which it uses for parallel GC operations. GC itself is a complex topic with many collector types, and the detailed documentation is extensive.

In the newer Java versions (the ones installed on our shared systems), the number of these threads is calculated with this formula, where ncpus is the number of processors Java detects:

(ncpus <= 8) ? ncpus : 3 + ((ncpus * 5) / 8)

That is, on a 512-core multiprocessor system (such as Cherax), Java will spawn 323 GC threads in addition to whatever threads the Java program itself needs. Note that older versions of Java default to one GC thread per processor.
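The formula can be evaluated directly in the shell to see what a given core count implies. This is a minimal sketch: the function name gc_threads is hypothetical, and it assumes the division is integer division, which is consistent with the 323 threads quoted for 512 cores.

```shell
# Default number of parallel GC threads for a given CPU count,
# following the formula quoted on this page (integer arithmetic).
gc_threads() {
    ncpus=$1
    if [ "$ncpus" -le 8 ]; then
        echo "$ncpus"
    else
        echo $(( 3 + (ncpus * 5) / 8 ))
    fi
}

gc_threads 8      # prints 8
gc_threads 512    # prints 323
```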

Because so many threads are created, a Java program running in a batch job will immediately take your account to the system limit, and all of your other login shells will be terminated. You will not be able to log in again until the running Java program finishes or is killed.

You can find out how many processes and threads are running under your account by running:
ps mU <your ident>
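To get a single number rather than a long listing, the per-process thread counts can be summed. This is a rough sketch, assuming a Linux system with the procps version of ps, where the nlwp field is the number of threads in each process. The function names are hypothetical.

```shell
# Sum a column of per-process thread counts read from standard input.
sum_threads() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Total number of threads owned by the current user.
# "nlwp=" prints only the thread count of each process, without a header.
count_my_threads() {
    ps -o nlwp= -u "$(id -un)" | sum_threads
}
```

Running count_my_threads prints the total. On Linux, ulimit -u prints the per-user process limit of the current shell, and threads count towards that limit.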

You will also see this error in your other login shells (before they are terminated):

The memory error that you see will be one of the following two:

and

Increasing the heap size (for example with -Xmx200m) does not help, and you will see similar errors when you try to compile Java code with javac.

The JVM cannot be created because Java tries to create too many parallel garbage collection helper threads. Each thread needs a certain amount of memory, so the total memory required exceeds what is available on the system.

The solution

While there is currently no way to tell the JVM how many processors to use, we can specify how many parallel garbage collection threads Java should use or, if desired, disable parallel garbage collection altogether (i.e. use serial garbage collection).

To set the number of parallel GC threads, use the following JVM command-line option:

To disable parallel GC, use this option:

For example:
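As a minimal sketch, assuming a HotSpot JVM: -XX:ParallelGCThreads=<n> is the standard HotSpot option for capping the number of parallel GC threads, and -XX:+UseSerialGC selects the serial collector. The wrapper function names, the GC_THREADS variable and the class name MyProgram are hypothetical.

```shell
# Run Java with parallel GC capped at a given number of threads (default 2).
java_gc_limited() {
    java -XX:ParallelGCThreads="${GC_THREADS:-2}" "$@"
}

# Run Java with the serial collector, which does its GC work in a single thread.
java_gc_serial() {
    java -XX:+UseSerialGC "$@"
}
```

With these definitions, java_gc_limited MyProgram runs java -XX:ParallelGCThreads=2 MyProgram, and GC_THREADS=4 java_gc_limited MyProgram raises the cap to four threads.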

To compile Java code using javac, you will need to use the -J option to specify the above JVM options, such as:
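javac hands anything prefixed with -J to the JVM that runs the compiler, with no space between -J and the option. A sketch, assuming the HotSpot option -XX:+UseSerialGC; the function name and Hello.java are hypothetical.

```shell
# Compile with the serial collector in the JVM that runs javac.
javac_gc_serial() {
    javac -J-XX:+UseSerialGC "$@"
}
```

Calling javac_gc_serial Hello.java is equivalent to typing javac -J-XX:+UseSerialGC Hello.java.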

The above JVM command line options can be specified by setting the _JAVA_OPTIONS environment variable, such as:
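A minimal sketch for a Bourne-style shell, assuming the HotSpot option -XX:ParallelGCThreads and a cap of two threads chosen purely for illustration:

```shell
# Apply a cap of two parallel GC threads to every JVM started from this shell,
# including the one that runs javac.
_JAVA_OPTIONS="-XX:ParallelGCThreads=2"
export _JAVA_OPTIONS

echo "$_JAVA_OPTIONS"    # prints -XX:ParallelGCThreads=2
```

Setting the variable this way replaces any value it already had in the current shell.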

The only catch is that when _JAVA_OPTIONS exists, you will see the "Picked up _JAVA_OPTIONS: ..." line every time.
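If that extra line gets in the way of a script that parses Java's output, it can be filtered out. This is a sketch under the assumption that the JVM writes the notice to standard error, which is the usual HotSpot behaviour. The function name and MyProgram are hypothetical.

```shell
# Drop the JVM's "Picked up _JAVA_OPTIONS: ..." notice from a stream of lines.
strip_java_options_notice() {
    grep -v '^Picked up _JAVA_OPTIONS:'
}
```

A typical use would be java MyProgram 2>&1 | strip_java_options_notice. Note that this merges standard error into standard output, and that grep returns a non-zero status when no lines remain.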

The default Java environment on our shared systems

On our system, loading the jdk module will automatically set the _JAVA_OPTIONS environment variable. By default, parallel garbage collection is disabled when you are using Java on the login node. In a batch job, by default, Java will only use two threads for parallel garbage collection. You can adjust this by setting the preferred options using the _JAVA_OPTIONS environment variable.
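The defaults described above can be imitated in a shell startup file on a system that lacks such a module. This is only a sketch of the idea, not the actual jdk module: it assumes HotSpot option names and a PBS-style scheduler that sets PBS_JOBID inside batch jobs, and the function name is hypothetical.

```shell
# Pick GC options the way this page describes the defaults:
# serial GC on a login node, two parallel GC threads inside a batch job.
# Assumption: PBS_JOBID is set (non-empty) only inside batch jobs.
default_java_options() {
    if [ -n "${PBS_JOBID:-}" ]; then
        echo "-XX:ParallelGCThreads=2"
    else
        echo "-XX:+UseSerialGC"
    fi
}

_JAVA_OPTIONS="$(default_java_options)"
export _JAVA_OPTIONS
```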

You can find out the environment setting for the JDK module by running module show jdk:

Garbage Collection Tuning

If your Java program uses large data structures, some garbage collection tuning may be worthwhile. See http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html.