Skip to content

Instantly share code, notes, and snippets.

@bossiernesto
Last active August 29, 2024 05:11
Show Gist options
  • Save bossiernesto/ccb3a847e83ae0ddf7db0b0eae30870f to your computer and use it in GitHub Desktop.
Save bossiernesto/ccb3a847e83ae0ddf7db0b0eae30870f to your computer and use it in GitHub Desktop.

Intro

Every Java developer knows that bytecode will be executed by JRE (Java Runtime Environment). But many doesn't know the fact that JRE is the implementation of Java Virtual Machine (JVM), which analyzes the bytecode, interprets the code, and executes it. It is very important as a developer that we should know the Architecture of the JVM, as it enables us to write code more efficiently. In this article, we will learn more deeply about the JVM architecture in Java and the different components of the JVM.

What is the JVM?

A Virtual Machine is a software implementation of a physical machine. Java was developed with the concept of WORA (Write Once Run Anywhere), which runs on a VM. The compiler compiles the Java file into a Java .class file, then that .class file is input into the JVM, which Loads and executes the class file. Below is a diagram of the Architecture of the JVM.

Image of JVM

How Does the JVM Work?

As shown in the above architecture diagram, the JVM is divided into three main subsystems:

  • Class Loader Subsystem
  • Runtime Data Area
  • Execution Engine

1. Class Loader Subsystem

Java's dynamic class loading functionality is handled by the class loader subsystem. It loads, links. and initializes the class file when it refers to a class for the first time at runtime, not compile time.

1.1 Loading

Classes will be loaded by this component. Boot Strap class Loader, Extension class Loader, and Application class Loader are the three class loader which will help in achieving it.

  • Boot Strap ClassLoader – Responsible for loading classes from the bootstrap classpath, nothing but rt.jar. Highest priority will be given to this loader.
  • Extension ClassLoader – Responsible for loading classes which are inside ext folder (jre\lib).
  • Application ClassLoader –Responsible for loading Application Level Classpath, path mentioned Environment Variable etc.

The above Class Loaders will follow Delegation Hierarchy Algorithm while loading the class files.

1.2 Linking

  • Verify – Bytecode verifier will verify whether the generated bytecode is proper or not if verification fails we will get the verification error.
  • Prepare – For all static variables memory will be allocated and assigned with default values.
  • Resolve – All symbolic memory references are replaced with the original references from Method Area.

1.3 Initialization

This is the final phase of Class Loading, here all static variables will be assigned with the original values, and the static block will be executed.

2. Runtime Data Area

The Runtime Data Area is divided into 5 major components:

  • Method Area – All the class level data will be stored here, including static variables. There is only one method area per JVM, and it is a shared resource.
  • Heap Area – All the Objects and their corresponding instance variables and arrays will be stored here. There is also one Heap Area per JVM. Since the Method and Heap areas share memory for multiple threads, the data stored is not thread safe.
  • Stack Area – For every thread, a separate runtime stack will be created. For every method call, one entry will be made in the stack memory which is called as Stack Frame. All - local variables will be created in the stack memory. The stack area is thread safe since it is not a shared resource. The Stack Frame is divided into three subentities:
  • Local Variable Array – Related to the method how many local variables are involved and the corresponding values will be stored here.
  • Operand stack – If any intermediate operation is required to perform, operand stack acts as runtime workspace to perform the operation.
  • Frame data – All symbols corresponding to the method is stored here. In the case of any exception, the catch block information will be maintained in the frame data.
  • PC Registers – Each thread will have separate PC Registers, to hold the address of current executing instruction once the instruction is executed the PC register will be updated with the next instruction.
  • Native Method stacks – Native Method Stack holds native method information. For every thread, a separate native method stack will be created.

3. Execution Engine

The bytecode which is assigned to the Runtime Data Area will be executed by the Execution Engine. The Execution Engine reads the bytecode and executes it piece by piece.

  • Interpreter – The interpreter interprets the bytecode faster, but executes slowly. The disadvantage of the interpreter is that when one method is called multiple times, every time a new interpretation is required.
  • JIT Compiler – The JIT Compiler neutralizes the disadvantage of the interpreter. The Execution Engine will be using the help of the interpreter in converting byte code, but when it finds repeated code it uses the JIT compiler, which compiles the entire bytecode and changes it to native code. This native code will be used directly for repeated method calls, which improve the performance of the system. Intermediate Code generator – Produces intermediate code
  • Code Optimizer – Responsible for optimizing the intermediate code generated above
  • Target Code Generator – Responsible for Generating Machine Code or Native Code
  • Profiler – A special component, responsible for finding hotspots, i.e. whether the method is called multiple times or not.
  • Garbage Collector: Collects and removes unreferenced objects. Garbage Collection can be triggered by calling "System.gc()", but the execution is not guaranteed. Garbage collection of the JVM collects the objects that are created.
  • Java Native Interface (JNI): JNI will be interacting with the Native Method Libraries and provides the Native Libraries required for the Execution Engine.
  • Native Method Libraries:It is a collection of the Native Libraries which is required for the Execution Engine.

Unified Types

In contrast to Java, all values in Scala are objects (including numerical values and functions). Since Scala is class-based, all values are instances of a class. The diagram below illustrates the class hierarchy.

Image of JVM

The superclass of all classes scala.Any has two direct subclasses scala.AnyVal and scala.AnyRef representing two different class worlds: value classes and reference classes. All value classes are predefined; they correspond to the primitive types of Java-like languages. All other classes define reference types. User-defined classes define reference types by default; i.e. they always (indirectly) subclass scala.AnyRef. Every user-defined class in Scala implicitly extends the trait scala.ScalaObject. Classes from the infrastructure on which Scala is running (e.g. the Java runtime environment) do not extend scala.ScalaObject. If Scala is used in the context of a Java runtime environment, then scala.AnyRef corresponds to java.lang.Object. Please note that the diagram above also shows implicit conversions called views between the value classs. Here is an example that demonstrates that both numbers, characters, boolean values, and functions are objects just like every other object:

object UnifiedTypes {
  def main(args: Array[String]) {
    val set = new scala.collection.mutable.HashSet[Any]
    set += "This is a string"  // add a string
    set += 732                 // add a number
    set += 'c'                 // add a character
    set += true                // add a boolean value
    set += main _              // add the main function
    val iter: Iterator[Any] = set.elements
    while (iter.hasNext) {
      println(iter.next.toString())
    }
  }
}

The program declares an application UnifiedTypes in form of a top-level singleton object with a main method. Themain method defines a local variable set which refers to an instance of class HashSet[Any]. The program adds various elements to this set. The elements have to conform to the declared set element type Any. In the end, string representations of all elements are printed out. Here is the output of the program:

c
true
<function>
732
This is a string
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment