Wednesday, August 26, 2015

Java Ergonomics and JVM Flags

The Java Virtual Machine (JVM) can tune itself depending on the environment, and this smart tuning is referred to as Ergonomics.

When tuning Java, it's important to know which default values Java Ergonomics selected for the garbage collector, heap sizes, and runtime compiler.

There are many JVM command line flags. So, how do we find out which JVM flags are used? It turns out that the JVM has flags to print flags. :)

I tested the following commands with Java 8.

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 15.04
Release: 15.04
Codename: vivid
$ uname -a
Linux isurup-ThinkPad-T530 3.19.0-26-generic #28-Ubuntu SMP Tue Aug 11 14:16:32 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Printing Command Line Flags

We can use "-XX:+PrintCommandLineFlags" to print the command line flags used by the JVM. This is a useful flag to see the values selected by Java Ergonomics.

$ java -XX:+PrintCommandLineFlags -version
-XX:InitialHeapSize=259964992 -XX:MaxHeapSize=4159439872 -XX:+PrintCommandLineFlags -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC 
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

My laptop is detected as a server-class machine, as I'm running a 64-bit version of Ubuntu. It has 16GB of memory.

$ free 
             total       used       free     shared    buffers     cached
Mem:      16247812   13500596    2747216    1020860     172252    7987388
-/+ buffers/cache:    5340956   10906856
Swap:     18622460      42072   18580388

So, as explained in Java Ergonomics, we can see that the initial heap size is 1/64 of physical memory and maximum heap size is 1/4 of physical memory.

$ echo $(((16247812 * 1024) / 259964992))
64
$ echo $(((16247812 * 1024) / 4159439872))
4
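The same arithmetic can be run in the other direction to estimate what the JVM would pick on a given machine. This is only a sketch that assumes the 1/64 and 1/4 rule described above holds; the helper name default_heap_sizes is hypothetical, and a real JVM may align or cap these values differently.

```shell
# default_heap_sizes is a hypothetical helper, not part of the JDK.
# Argument: physical memory in kB (for example MemTotal from /proc/meminfo).
# Assumes initial heap = 1/64 and maximum heap = 1/4 of physical memory.
default_heap_sizes() {
    mem_bytes=$(( $1 * 1024 ))
    echo "InitialHeapSize=$(( mem_bytes / 64 ))"
    echo "MaxHeapSize=$(( mem_bytes / 4 ))"
}

# Total memory reported by "free" on my laptop
default_heap_sizes 16247812
```

For 16247812 kB this prints InitialHeapSize=259964992 and MaxHeapSize=4159439872, the same values that -XX:+PrintCommandLineFlags reported.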

You can also notice that the parallel garbage collector (-XX:+UseParallelGC) is selected by the JVM.
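When only one value is of interest, the flags line can be split and filtered. The following is a minimal sketch; flag_value is a hypothetical helper name, and it only handles flags of the form -XX:Name=value.

```shell
# flag_value is a hypothetical helper: it reads -XX:+PrintCommandLineFlags
# output on stdin and prints the value of the named -XX:Name=value flag.
flag_value() {
    tr ' ' '\n' | sed -n "s/^-XX:$1=//p"
}

# Sample line copied from the output shown above
line='-XX:InitialHeapSize=259964992 -XX:MaxHeapSize=4159439872 -XX:+PrintCommandLineFlags -XX:+UseParallelGC'
echo "$line" | flag_value MaxHeapSize
```

On a real system the input could come from "java -XX:+PrintCommandLineFlags -version 2>&1 | flag_value MaxHeapSize".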

Printing Initial JVM Flags

We can use "-XX:+PrintFlagsInitial" to see the default values of the JVM flags.

$ java -XX:+PrintFlagsInitial -version

When printing the JVM flags, we can see there are several columns. Those columns are the data type, the name, the assignment operator, the value and the flag type.

Printing Final JVM Flags

See the blog post on -XX:+PrintFlagsFinal to learn more about each column.

$ java -XX:+PrintFlagsFinal -version

When printing final flags, we can see that some flags have the assignment operator ":=", which indicates that the flag value was modified (either manually or by Java Ergonomics).

$ java -XX:+PrintFlagsFinal -version | grep ':='
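The grep above keeps the whole line. If only the names and values of the modified flags are needed, the columns can be picked with awk. This is a sketch under the assumption that every line follows the "type name operator value {flag type}" layout; modified_flags is a hypothetical helper and the sample lines are illustrative, not real output from my machine.

```shell
# modified_flags is a hypothetical helper: it reads -XX:+PrintFlagsFinal output
# on stdin and prints "name=value" for flags whose operator is ":=".
# Flags with an empty value (some string flags) are not handled by this sketch.
modified_flags() {
    awk '$3 == ":=" { print $2 "=" $4 }'
}

# Illustrative sample lines in the PrintFlagsFinal column layout
modified_flags <<'EOF'
    uintx InitialHeapSize                          := 259964992                           {product}
     bool UseG1GC                                   = false                               {product}
     bool UseParallelGC                            := true                                {product}
EOF
```

Usage on a real JVM would be "java -XX:+PrintFlagsFinal -version 2>/dev/null | modified_flags".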

JVM Flag Types

All JVM flags are categorized into different types, which can be seen inside curly brackets in the -XX:+PrintFlagsInitial/-XX:+PrintFlagsFinal output.

$ java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UnlockCommercialFeatures -XX:+PrintFlagsFinal -version  |  awk -F ' {2,}' '{print $4}' | sort -u 
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

{ARCH diagnostic}
{ARCH experimental}
{ARCH product}
{C1 diagnostic}
{C1 pd product}
{C1 product}
{C2 diagnostic}
{C2 experimental}
{C2 pd product}
{C2 product}
{pd product}
{product rw}

As mentioned in the blog post on -XX:+PrintFlagsFinal, we can find some details on these flag types in the JDK source.

The following are the meanings of the above flag types:

  • product - General Flags for JVM, which are officially supported.
  • product rw - Writable internal product flags
  • manageable - Writable external product flags.
  • diagnostic - These flags are not meant for JVM tuning or for product modes. Can be used for JVM debugging. Need to use -XX:+UnlockDiagnosticVMOptions
  • experimental - These flags are in support of features, which are not part of officially supported product. Need to use -XX:+UnlockExperimentalVMOptions
  • commercial - These flags are related to commercial features of the JVM. Need a license to use in production. Need to use -XX:+UnlockCommercialFeatures
  • C1 - Client JIT Compiler specific flags
  • C2 - Server JIT Compiler specific flags
  • pd - Platform dependent flags
  • lp64 - Flags for the 64-bit JVM
  • ARCH - Architecture (CPU: x86, sparc etc) dependent flags
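To see how many flags fall into each of the types listed above, the text inside the curly brackets can be counted. This is a minimal sketch; count_flag_types is a hypothetical helper, and the sample lines are illustrative rather than copied from real output.

```shell
# count_flag_types is a hypothetical helper: it reads PrintFlagsFinal output on
# stdin, takes the text between "{" and "}" on each line, and prints
# "type: count" sorted by type. Lines without curly brackets are ignored.
count_flag_types() {
    awk '
        {
            s = index($0, "{")
            e = index($0, "}")
            if (s > 0 && e > s) {
                t = substr($0, s + 1, e - s - 1)
                n[t]++
            }
        }
        END { for (t in n) print t ": " n[t] }
    ' | LC_ALL=C sort
}

# Illustrative sample lines in the PrintFlagsFinal column layout
count_flag_types <<'EOF'
    uintx InitialHeapSize                          := 259964992                           {product}
     bool UseParallelGC                            := true                                {product}
     intx CompileThreshold                          = 10000                               {pd product}
     bool PrintGC                                   = false                               {manageable}
EOF
```

Piping "java -XX:+PrintFlagsFinal -version 2>/dev/null" into count_flag_types would give the counts for a real JVM.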

JVM Flag Data Types

$ java -XX:+PrintFlagsFinal -version | awk '{if (NR!=1) {print $1}}' | sort -u
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

The data types for JVM Flags are intuitive.

  • bool - Boolean (true/false)
  • ccstr - String
  • ccstrlist - Represents String arguments, which can accumulate
  • double - Double
  • intx - Integer
  • uint64_t - Unsigned long
  • uintx - Unsigned integer


If you ever want to find out all possible JVM flags, the -XX:+PrintFlagsFinal flag is the solution for you. Java Ergonomics selects the best values depending on the environment, and it's important to be aware of Java Ergonomics when tuning your application.

Saturday, August 22, 2015

Finding how many processors

I wanted to find out the processor details of my laptop and I found out that there are several ways to check. For example, see the Red Hat community discussion on Figuring out CPUs and Sockets.

In this blog post, I'm listing a few commands to find out details about CPUs.

I'm using Ubuntu on my Lenovo ThinkPad T530 laptop, and the following commands should work on any Linux system.

Display information about CPU architecture

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Stepping:              9
CPU MHz:               1199.988
CPU max MHz:           3600.0000
CPU min MHz:           1200.0000
BogoMIPS:              5787.10
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              4096K
NUMA node0 CPU(s):     0-3

$ lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ    MINMHZ
0   0    0      0    0:0:0:0       yes    3600.0000 1200.0000
1   0    0      0    0:0:0:0       yes    3600.0000 1200.0000
2   0    0      1    1:1:1:0       yes    3600.0000 1200.0000
3   0    0      1    1:1:1:0       yes    3600.0000 1200.0000

The lscpu command gathers the CPU architecture details from sysfs and the /proc/cpuinfo file. So, you can also read that file to get more info.

$ cat /proc/cpuinfo

lshw (list hardware) is another command to get CPU details.

$ sudo lshw -class processor

There is a GUI tool named "hardinfo", which can show details about various hardware components.

$ sudo apt-get install hardinfo
$ hardinfo

dmidecode is another tool to find out hardware details. The following example finds processor details. The value "4" is the DMI type for the processor.

$ sudo dmidecode -t 4

The cpuid tool can dump CPUID information for each CPU.

$ sudo apt-get install cpuid
$ cpuid

inxi is another tool to check hardware information on Linux.

$ sudo apt-get install inxi
# Show full CPU output, including per CPU clockspeed and CPU max speed
$ inxi -C

Finding number of processors

# Print the number of processing units available
$ nproc

$ cat /proc/cpuinfo | grep processor | wc -l

Finding number of Physical CPUs

$ cat /proc/cpuinfo | grep "physical id" | sort -u | wc -l

Finding number of cores per socket

$ lscpu | grep 'socket'
Core(s) per socket:    2

$ cat /proc/cpuinfo | grep "cpu cores" | uniq
cpu cores : 2

Finding number of threads per core

$ lscpu | grep -i thread
Thread(s) per core:    2


It's important to know about the CPUs when we are doing performance tests. A physical CPU is inserted into a single CPU socket, so a system needs multiple CPU sockets to support multiple CPUs. However, modern CPUs support multiple cores and hyper-threading. See: CPU Basics: Multiple CPUs, Cores, and Hyper-Threading Explained

We can use the following formula to calculate the number of logical processors we see in our system.

Number of Logical Processors = Number of Sockets x Number of Cores per Socket x Threads per Core

In my laptop, there is one physical processor with two cores and each core has two threads. So, I have 4 logical processors. :)
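The formula can be checked against the lscpu output directly. The following is a sketch that assumes the field labels shown in the lscpu output above ("Socket(s)", "Core(s) per socket", "Thread(s) per core"); logical_processors is a hypothetical helper name.

```shell
# logical_processors is a hypothetical helper: it reads lscpu output on stdin
# and prints sockets x cores per socket x threads per core.
logical_processors() {
    awk -F': *' '
        /Socket\(s\):/          { sockets = $2 }
        /Core\(s\) per socket:/ { cores = $2 }
        /Thread\(s\) per core:/ { threads = $2 }
        END { print sockets * cores * threads }
    '
}

# Values copied from the lscpu output of my laptop
logical_processors <<'EOF'
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
EOF
```

Running "lscpu | logical_processors" on a live system should normally agree with the number printed by nproc when all CPUs are online.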

Monday, August 10, 2015

Linux Performance Observability Tools

I am learning about Linux Performance Tools and I find Brendan Gregg's talks on Linux Performance very interesting.

There are so many performance tools for Linux. Brendan recommends following a performance analysis methodology to analyze system or application performance. These methodologies can guide us to choose and use these performance tools effectively.

Linux Performance Observability Tools

There are different types of command line tools available in Linux. In this blog post, I'm going to focus on Linux Performance Observability Tools. I highly recommend watching Brendan's talk at Velocity 2015 on Linux Performance Tools. I took the details about the following tools from his presentation and his website.
[Figure: Linux Performance Observability Tools, taken from Brendan Gregg's website]

Here are some examples of using Linux Performance Observability Tools in Ubuntu. I tested each of these commands in an Ubuntu Trusty Vagrant Box.

Basic Observability Tools

# Print load averages
uptime

# System and per-process interval summary.
# It's important to note that top can miss short-lived processes.
# See 30 Linux TOP Command Examples With Screenshots
top

# htop is an interactive process viewer and you need to install it.
sudo apt-get install htop
htop

# Process status listing
ps -ef
# ASCII art forest
ps -ef f

# Virtual Memory Statistics
# Show stats in Megabytes and update every second
vmstat -SM 1

# Block I/O (disk) stats. The iostat tool is in 'sysstat' package.
sudo apt-get install sysstat
# Display extended statistics in megabytes per second.
# This also shows device utilization and omits the inactive devices
# during the sample period.
iostat -xmdz 1

# Report multi-processor statistics.
# per-CPU stats
mpstat -P ALL 1

# Main memory usage in megabytes.
free -m
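The "free" output in the Java Ergonomics post above has a "-/+ buffers/cache" line, which reports used memory without buffers and page cache. The same number can be derived by hand. This sketch assumes the values map to the MemTotal, MemFree, Buffers and Cached fields of /proc/meminfo; used_without_cache is a hypothetical helper name.

```shell
# used_without_cache is a hypothetical helper: it reads /proc/meminfo-style
# lines on stdin and prints used memory in kB excluding buffers and cache.
used_without_cache() {
    awk '
        /^MemTotal:/ { total = $2 }
        /^MemFree:/  { mem_free = $2 }
        /^Buffers:/  { buffers = $2 }
        /^Cached:/   { cached = $2 }
        END { print total - mem_free - buffers - cached }
    '
}

# Figures taken from the "free" output of my laptop,
# arranged in /proc/meminfo format for illustration
used_without_cache <<'EOF'
MemTotal:       16247812 kB
MemFree:         2747216 kB
Buffers:          172252 kB
Cached:          7987388 kB
EOF
```

With these figures the result is 5340956, which matches the "used" column of the "-/+ buffers/cache" line. On a live system the input would be "used_without_cache < /proc/meminfo".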

Intermediate Observability Tools

# Trace system calls and signals. This is not recommended in production.
# Trace system calls in a process. 
# Prints the time (us) since epoch (-ttt) and syscall time (-T).
# Need to use sudo to attach to the process.
sudo strace -tttT -p 3344

# Sniff network packets for post analysis. 
# Using sudo to get permissions to capture packets on device
sudo tcpdump -i eth0 -w /tmp/out.tcpdump
# Read the dump
sudo tcpdump -nr /tmp/out.tcpdump

# Print network connections. See 10 basic examples of linux netstat command
# Network statistics by protocol
netstat -s
# Show both listening and non-listening sockets
netstat -a
# Show listening sockets
netstat -l
# Show the PID (-p) and the name of the program for TCP sockets (-t)
netstat -tp
# Kernel IP routing table
netstat -r
# Kernel Interface table
netstat -i

# Print network traffic statistics. You need to install the 'nicstat' package.
sudo apt-get install nicstat
nicstat

# Process Stats
# Process Stats by thread
pidstat -t
# Process Stats by disk I/O
pidstat -d

# Show swap usage
swapon -s
# Show swap usage in verbose mode
swapon -v

# List open files. Can be used as a debug tool
lsof
# Show active network connections
lsof -i

# System Activity Reporter. 
# Before using sar, we need to enable data collection.
# See:
# Simple steps to install and configure sysstat/sar on Ubuntu/Debian server
# 10 Useful Sar (Sysstat) Examples for UNIX / Linux Performance Monitoring
sar -q
[Figure: Linux Performance Observability: sar, taken from Brendan Gregg's website]
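The netstat output listed above is often easier to read as a summary. As a small sketch, the TCP sockets can be counted per state. It assumes the usual "netstat -ant" column layout, where the state is the sixth column; tcp_state_counts is a hypothetical helper and the sample lines are made up for illustration.

```shell
# tcp_state_counts is a hypothetical helper: it reads "netstat -ant" output on
# stdin and prints the number of TCP sockets in each state.
tcp_state_counts() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Made-up sample in the "netstat -ant" layout
tcp_state_counts <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.2.15:22            10.0.2.2:51234          ESTABLISHED
tcp6       0      0 :::22                   :::*                    LISTEN
EOF
```

On a live system this would be "netstat -ant | tcp_state_counts".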

Advanced Observability Tools

# Socket Statistics. This is similar to netstat,
# but it can display more TCP and state information than other tools.
# Socket Statistics. Show timer, processes and memory
ss -mop
# Show internal TCP information
ss -i

# Interactive Colorful IP LAN Monitor
sudo apt-get install iptraf
sudo iptraf

# Monitor I/O. A top-like tool
sudo apt-get install iotop
sudo iotop

# Kernel slab allocator memory usage
sudo slabtop

# Page cache statistics. 
# This tool is available on GitHub,
# and we need to use Go to build it.
# You can also download a binary from GitHub. Refer to the README for more information.
./pcstat testfile

# perf_events: Linux profiling with performance counters.
# This tool needs to be installed.
# I used perf command in previous blog post about Java CPU Flame Graphs. 
# See Brendan's Linux Perf Examples.
# Perf Tutorial is also good resource to learn about perf.
sudo apt-get install linux-tools-common linux-tools-generic
# List perf events
sudo perf list

# tiptop: reads hardware performance counters
# and displays statistics about running processes, such as IPC, or cache misses. 
# This tool was not available in Ubuntu 14.04 package repositories.
# Therefore I tried it on Ubuntu 15.04
sudo apt-get install tiptop

# The rdmsr command reads a Model-Specific Register
# (MSR) value from the specified address.
# This tool is available in the "msr-tools" package.
sudo apt-get install msr-tools
# Brendan has developed some Model Specific Register (MSR) 
# tools for Xen guests (eg, AWS EC2).
# Reading the CPU temperature target (bits 23:16 of MSR 0x1a2):
sudo rdmsr -p1 -f 23:16 -d 0x1a2
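In the rdmsr command above, "-f 23:16" selects a bit field of the register and "-d" prints it in decimal. The extraction itself is plain shift-and-mask arithmetic, which the following sketch reproduces. msr_field is a hypothetical helper, and the raw value 0x00691400 is made up for illustration; it was not read from a real CPU.

```shell
# msr_field is a hypothetical helper: it prints, in decimal, the bits hi..lo
# of a raw register value.
# Arguments: raw value (hex or decimal), high bit, low bit.
msr_field() {
    value=$1
    hi=$2
    lo=$3
    width=$(( $hi - $lo + 1 ))
    echo $(( ($value >> $lo) & ((1 << $width) - 1) ))
}

# 0x00691400 is a made-up raw value used only for illustration
msr_field 0x00691400 23 16
```

Bits 23:16 of 0x00691400 are 0x69, so the sketch prints 105.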


This blog post lists some Linux Performance Observability Tools. I have also linked man pages and some examples of using the tools.

As I mentioned, Brendan's presentations have more details on these tools and I just wanted to list those in one page for my own reference.