Classes are the building blocks of Java’s type system, but
they also serve another fundamental purpose: a class
is a compilation unit, the smallest piece of code that can be
individually loaded and run by a JVM process. The class-loading
mechanism has been in place since the very beginning of Java,
back in JDK 1.0, and it contributed greatly to Java’s popularity
as a cross-platform solution. Compiled Java code, in the form of
class files and packaged JAR files, can be loaded into a running
JVM process on any of the many supported operating systems.
It’s this ability that has allowed developers to easily distribute
compiled binaries of libraries. Because it is so much easier to
distribute JAR files than source code or platform-dependent
binaries, this ability has made Java popular, particularly in
open source projects.
In this article, I explain in detail how the Java class-loading
mechanism works. I also explain how classes are found on the
classpath and how they are loaded into memory and initialized
for use.
The Mechanics of Loading Classes into the JVM
Imagine you have a simple Java program such as the one
below:
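public class A {
    public static void main(String[] args) {
        B b = new B();
        int i = b.inc(0);
        System.out.println(i);
    }
}

// Class B is not shown in the original listing; this minimal version is assumed.
class B {
    int inc(int i) { return i + 1; }
}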
When you compile this piece of code and run it, the JVM
correctly determines the entry point into the program and
starts running the main method of class A. However, the JVM
doesn’t load all imported classes or even referred-to classes
eagerly—that is, right away. In particular, this means that
only when the JVM encounters the bytecode instructions for
the new B() statement will it try to locate and load class B.
Besides calling a constructor of a class, there are other ways
to initiate the process of loading a class, such as accessing
a static member of the class or accessing it through the
Reflection API.
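To make the on-demand behavior visible, here is a minimal
sketch. The names LazyLoadingDemo and Helper are hypothetical
(Helper stands in for class B from the example above). Note
that a static initializer marks the moment a class is
initialized, which can happen later than loading, so treat
this as an approximation of what the JVM does rather than a
precise trace of it.

```java
import java.util.ArrayList;
import java.util.List;

class LazyLoadingDemo {
    // Records the order in which things happen.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for class B from the article's example.
    static class Helper {
        static { events.add("Helper initialized"); }
        int inc(int i) { return i + 1; }
    }

    static List<String> run() {
        events.add("before new Helper()");
        int result = new Helper().inc(0);   // first active use of Helper
        events.add("after new Helper(), result=" + result);
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The "Helper initialized" entry is recorded only after the
"before" entry, even though Helper is referenced in the source
of LazyLoadingDemo from the start.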
To actually load a class, the JVM uses class loader
objects. Every already loaded class contains a reference
to its class loader, and that class loader is used to
load all the classes referenced from that class. In the preceding
example, this means that loading class B can be
approximately translated into the following Java statement:
A.class.getClassLoader().loadClass("B").
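The following sketch spells out that approximate translation.
LoaderLookupDemo and Helper are hypothetical names; the point
is only that asking a class’s own loader for a referenced
class by name yields the same Class object that the compiled
reference resolves to.

```java
class LoaderLookupDemo {
    // Hypothetical class that LoaderLookupDemo refers to.
    static class Helper { }

    static boolean sameClassViaLoader() {
        try {
            // The loader that defined the referencing class...
            ClassLoader loader = LoaderLookupDemo.class.getClassLoader();
            // ...is asked for the referenced class by its binary name.
            Class<?> byName = loader.loadClass(Helper.class.getName());
            return byName == Helper.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same Class object: " + sameClassViaLoader());
    }
}
```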
Here comes a paradox: every class loader is itself an object
of the java.lang.ClassLoader type, which developers can use
to locate and load classes by name, and the class of that
object has to be loaded by something. If you’re confused by
this chicken-and-egg problem and wonder how the first class
loader that loads all the JDK classes (for example,
java.lang.String) is created, you’re thinking along the right
lines. Indeed, the primordial class loader, called
the bootstrap class loader, comes from the core
of the JVM and is written in native, platform-dependent
code. It loads the classes necessary
for the JVM itself, such as those of the java.lang
package, classes for Java primitives, and so forth.
Application classes are loaded using the regular,
user-defined class loaders written in Java, so, if
needed, the developer can influence the processing
of these loaders.
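Because the bootstrap class loader lives in native code, it
has no ClassLoader object of its own. On the JVMs I am aware
of, Class.getClassLoader() reports it as null, which the
sketch below relies on (the API documentation only says that
implementations may do so). BootstrapLoaderCheck is a
hypothetical name.

```java
class BootstrapLoaderCheck {
    // Describes which loader defined the given class.
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        String who = (loader == null)
                ? "bootstrap class loader (reported as null)"
                : loader.getClass().getName();
        return type.getName() + " -> " + who;
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));               // a JDK core class
        System.out.println(describe(BootstrapLoaderCheck.class)); // an application class
    }
}
```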
The Class-Loader Hierarchy
The class loaders in the JVM are organized into
a tree hierarchy, in which every class loader
has a parent. Before trying to locate and load a class
itself, a class loader should, as a good practice, check
whether its parent in the hierarchy can load (or has
already loaded) the required class. This helps
avoid doing double work and loading classes repeatedly. As a
rule, the classes of the parent class loader are visible to the
children, but not the other way around. This structure, which
is based on delegation and visibility of the classes, allows for
separation of the responsibilities of the class loaders in the
hierarchy and makes the class loaders responsible for loading
classes from a specific location only.
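Parent-first delegation is what the default
ClassLoader.loadClass implementation does: it asks the parent
and calls the loader’s own findClass method only if the
parent fails. The sketch below uses a hypothetical
RecordingLoader that cannot find anything itself and merely
records what it was asked for, so you can see which requests
ever reach the child.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    // Hypothetical loader: finds nothing on its own, only records requests.
    static class RecordingLoader extends ClassLoader {
        final List<String> ownLookups = new ArrayList<>();

        RecordingLoader(ClassLoader parent) { super(parent); }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            ownLookups.add(name);              // reached only after the parent failed
            throw new ClassNotFoundException(name);
        }
    }

    static boolean tryLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    static List<String> ownLookupsAfterDemo() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        tryLoad(loader, "java.lang.String"); // answered by the parent chain
        tryLoad(loader, "no.such.Type");     // parents fail, so findClass is called
        return loader.ownLookups;
    }

    public static void main(String[] args) {
        System.out.println("requests that reached the child: " + ownLookupsAfterDemo());
    }
}
```

Only the request that no parent could satisfy shows up in the
child’s list.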
Let’s look at this hierarchy of class loaders in a Java application
and explore what classes they typically load. At the
root of the hierarchy is the bootstrap class loader. It
loads the system classes required to run the JVM itself. You
can expect the core classes that were provided with the JDK
distribution to be loaded by this class loader.
(A developer can expand the set of classes that
the bootstrap class loader will be able to load by
using the -Xbootclasspath JVM option.)
Note that even if a library is
put on the boot classpath, it won’t be
loaded and initialized automatically. Classes are loaded
into the JVM only on demand, so even though
classes might be available for the bootstrap class
loader, the application needs to access them to
trigger their actual loading. (A curious aspect of
this loading process is that you can override JDK
classes if your JAR file is prepended to the boot
classpath. While this is almost always a poor
idea, it does open a door to potentially more powerful
tools.)
A sort of child of the bootstrap class loader is
the extension class loader, which loads the classes
from the extension directories (explained in a
moment). These classes may be used to specify
machine-specific configuration such as locales,
security providers, and so on. The locations of
the extension directories are specified via the
java.ext.dirs system property, which on my machine is
set to the following:
/Users/shelajev/Library/Java/Extensions:/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home/jre/lib/ext:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
By changing the value of this property, you can change
which additional libraries are loaded into the JVM process.
Next comes the system class loader, which loads the
application classes and the classes available on the
classpath. Users can specify the classpath using the
-cp (or -classpath) command-line option.
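If you want to see what the system class loader has to work
with, the effective classpath is exposed through the standard
java.class.path system property. The sketch below (with a
hypothetical class name, ClasspathEntries) simply splits it
into its entries in search order.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ClasspathEntries {
    // Returns the classpath entries in the order the JVM was given them.
    static List<String> entries() {
        String classpath = System.getProperty("java.class.path", "");
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        return Arrays.asList(classpath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        entries().forEach(System.out::println);
    }
}
```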
Both the extension class loader and the system
class loader are of the URLClassLoader type and
behave in the same way: they delegate to the parent
first, and only then find and resolve the
required classes themselves, if need dictates.
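You can inspect the hierarchy on your own JVM by walking the
getParent() chain from the system class loader. This is a
small sketch with a hypothetical class name, LoaderChain; the
concrete loader class names it prints depend on the JDK
version, so I don’t list an expected output.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Collects the class names of the loaders from 'start' up to the root.
    static List<String> chainFrom(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            names.add(loader.getClass().getName());
        }
        // A null parent conventionally stands for the bootstrap class loader.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        chainFrom(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
    }
}
```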
The class-loader hierarchy of web applications
is a bit more complicated. Because multiple
applications can be deployed simultaneously to an application
server, they need to be able to distinguish their classes
from each other. So the application server automatically
provides every web application with its own class loader,
which is responsible for loading the application’s libraries.
Such isolation ensures that different web applications
deployed to a single server can have different versions of
the same library without conflicts.
This arrangement works because the web application class
loader will try to locate the classes packaged in the application’s
WAR file first, rather than first delegating the search to
the parent class loader.
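A loader that looks in its own locations first can be
sketched in a few lines. This is not how any particular
application server implements it; ChildFirstClassLoader is a
hypothetical name, and the rule that java.* classes always go
to the parent is my own simplification (real containers apply
more elaborate filters).

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

class ChildFirstClassLoader extends URLClassLoader {
    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> type = findLoadedClass(name);       // already defined by this loader?
            if (type == null) {
                if (name.startsWith("java.")) {
                    type = super.loadClass(name, false); // core classes: always the parent
                } else {
                    try {
                        type = findClass(name);          // own URLs first
                    } catch (ClassNotFoundException notLocal) {
                        type = super.loadClass(name, false); // then the usual delegation
                    }
                }
            }
            if (resolve) {
                resolveClass(type);
            }
            return type;
        }
    }

    // With no URLs of its own, the loader must fall back to its parent.
    static boolean fallsBackToParent() {
        ClassLoader parent = ChildFirstClassLoader.class.getClassLoader();
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(new URL[0], parent)) {
            String name = ChildFirstClassLoader.class.getName();
            return loader.loadClass(name) == ChildFirstClassLoader.class;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("falls back to parent: " + fallsBackToParent());
    }
}
```

In a real deployment, the URL array would point at the
application’s own class directories and JAR files, so classes
packaged there would win over copies visible to the parent.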
Finding the Right Class
In general, if multiple classes with the same fully qualified
name are available to the JVM, the conflict resolution strategy
is simple and straightforward: the first appropriate class
wins. URLClassLoader, which most class loaders
extend, will traverse the directories in the order they
are given on the classpath and load the first class it finds
that has the requested name.
The same goes for JAR files that contain classes with the
same name. The JAR files will be scanned in the order in
which they appear in the classpath, not according to their
names. If the first JAR
file contains an entry for the required class, the class will be
loaded. If not, the classpath scan will continue and reach the
second JAR file. Naturally, if the class isn’t
found anywhere on the classpath, a
ClassNotFoundException will be thrown.
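The first-match rule is easy to observe with a URLClassLoader
and two directories. To keep the sketch short, I use a plain
resource file (marker.txt, a hypothetical name) as a stand-in
for a class file; as far as I know, URLClassLoader walks the
same ordered search path for both. The temporary directories
are left behind for the operating system to clean up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FirstEntryWins {
    // Puts a file with the same name into two directories and looks it up.
    static String firstMatch() {
        try {
            Path first = Files.createTempDirectory("cp-first");
            Path second = Files.createTempDirectory("cp-second");
            Files.write(first.resolve("marker.txt"), "first".getBytes(StandardCharsets.UTF_8));
            Files.write(second.resolve("marker.txt"), "second".getBytes(StandardCharsets.UTF_8));

            // The order of this array plays the role of the classpath order.
            URL[] searchPath = { first.toUri().toURL(), second.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(searchPath, null);
                 BufferedReader reader = new BufferedReader(new InputStreamReader(
                         loader.getResourceAsStream("marker.txt"), StandardCharsets.UTF_8))) {
                return reader.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("content found: " + firstMatch());
    }
}
```

Swapping the two entries in searchPath makes the lookup
return the other file, which is exactly why relying on the
order is fragile.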
Usually, relying on the order of directories
in the classpath is a fragile practice, so
instead the developer can add the classes to
-Xbootclasspath to ensure that they will be
loaded first. There’s nothing in particular wrong
with this approach, but maintaining a project
that relies on a polluted boot classpath requires work.
Intuition about where the classes are loaded from will be
broken, and everyone will be confused. A better practice is
to resolve the confusion at its root and figure out why there
are multiple classes with the same name on the classpath.
Maybe upgrading some dependency version, cleaning the
caches, or running a clean build will be enough to get rid of
the duplicates.
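One way to hunt for such duplicates is to ask the class
loader for every location that provides a given class file.
The sketch below uses ClassLoader.getResources for that;
ClassLocations is a hypothetical name, and the approach only
sees what the system class loader and its parents can see.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class ClassLocations {
    // Lists every location visible to the system class loader that
    // contains a class file for the given fully qualified class name.
    static List<URL> locationsOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass a class name as the first argument, or inspect this class itself.
        String name = args.length > 0 ? args[0] : ClassLocations.class.getName();
        locationsOf(name).forEach(System.out::println);
    }
}
```

More than one printed location for the same name means there
are competing copies, and the first one listed is the one a
parent-first loader would normally pick.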
Resolution, Linking, and Verification
After a class is located and its initial in-memory representation
is created in the JVM process, it is verified, prepared,
resolved, and initialized.
■■ Verification makes sure that the class is not corrupted and
is structurally correct: its runtime constant pool is valid,
the types of variables are correct, and the variables are
initialized prior to being accessed. Verification can be
turned off by supplying the -noverify option. If the JVM
process does not run potentially malicious code, strict
verification might not be required. Turning off the verification
can speed up the startup of the JVM. Another
benefit is that some classes, especially those generated
on the fly by various tools, can be valid and safe for the
JVM but unable to pass the strict verification process. In
order to use such tools, the developer should disable this
verification, which is often acceptable to do in a development
environment.
■■ Preparation of a class involves initializing its static fields to
the default values for their respective types. (After preparation,
fields of type int contain 0, references are null, and
so forth.)
■■ Resolution of a class means checking that the symbolic
references in the runtime constant pool actually point
to valid classes of the required types. The resolution of a
symbolic reference triggers loading of the referenced
class. According to the JVM specification, this resolution
process can be performed lazily, so it is deferred until the
class is used.
■■ Initialization expects a prepared and verified class. It runs
the class’s initializer. During initialization, the static fields
are initialized to whatever values are specified in the code.
The static initializer method that combines the code from
all the static initialization blocks is also run. The initialization
process must run only once for every loaded class,
so it is synchronized. Because the initialization
of one class can trigger the initialization of other classes,
the JVM has to perform it with care to avoid deadlocks.
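The difference between loading and initializing, and the
run-once guarantee, can be observed from plain Java code. In
this sketch, InitializationDemo and Settings are hypothetical
names. Class.forName with its second argument set to false
loads a class without initializing it; the first read of a
static field then triggers the initializer.

```java
class InitializationDemo {
    // Counts how many times the static initializer of Settings has run.
    static int initializerRuns = 0;

    static class Settings {
        static int timeout = 30;          // assigned during initialization
        static { initializerRuns++; }
    }

    static void loadWithoutInitializing() {
        try {
            // 'false' asks for loading only, without initialization.
            Class.forName(Settings.class.getName(), false,
                    InitializationDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static int initializerRunsAfterTwoReads() {
        loadWithoutInitializing();
        int sum = Settings.timeout + Settings.timeout; // first read triggers initialization
        if (sum != 60) {
            throw new IllegalStateException("unexpected value " + sum);
        }
        return initializerRuns;
    }

    public static void main(String[] args) {
        loadWithoutInitializing();
        System.out.println("initializer runs after loading only: " + initializerRuns);
        System.out.println("initializer runs after two reads:    "
                + initializerRunsAfterTwoReads());
    }
}
```

When run as a fresh program, the first line should report 0
and the second 1, no matter how many more times the field is
read afterward.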
More detail on how the JVM performs the loading, linking,
and initializing of classes can be found in Chapter 5 of the Java
Virtual Machine Specification.
Other Considerations About Class Loaders
The class-loading model is the central piece of the dynamic
operations of the Java platform. Not only does it allow for
dynamic location and linking of classes at runtime, but
it also provides an interface for various tools to hook into
the application.
In addition, many security features rely on the class-loader
hierarchy for permission checks. For example, the famous
method sun.misc.Unsafe.getUnsafe() successfully
returns an instance of the Unsafe class only if it is called
from a class that was loaded by the bootstrap class loader.
Because only system classes are loaded by this loader, every
library that uses the Unsafe API must rely on the Reflection
API to read the reference from a private field.
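The reflective workaround looks roughly like the sketch
below. The field name theUnsafe is an assumption: it is the
name used in OpenJDK-based runtimes I have seen, but it is an
internal detail that other JVMs may not share, and the whole
API is unsupported. UnsafeLookup is a hypothetical name, and
the sketch returns the instance as a plain Object to avoid a
compile-time dependency on the internal class.

```java
import java.lang.reflect.Field;

class UnsafeLookup {
    // Reads the singleton from a private static field instead of
    // calling sun.misc.Unsafe.getUnsafe(), which checks the caller's loader.
    static Object readUnsafeInstance() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            // Assumption: the private field is called "theUnsafe" on this runtime.
            Field field = unsafeClass.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return field.get(null);       // static field, so no receiver object
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible here", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readUnsafeInstance().getClass().getName());
    }
}
```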
Conclusion
When you’re developing a library or a framework, as a rule,
you don’t have to worry about any issues with class loading.
It is a dynamic process that happens at runtime, so you
rarely need to influence it. Also, modifying the class-loading
scheme rarely benefits a typical Java library.
However, if you create a system of modules or plugins that
are intended to be isolated from each other, enhancing the
class-loading scheme might be a good idea. Just remember
that custom class loaders, being a fundamental force influencing
all the classes, can introduce hard-to-spot bugs into
literally any part of your application. So take extra care when
designing your own class-loading functionality.
In this article, we looked at how the JVM loads classes into
the runtime, at the hierarchical model of class loaders that
Java uses, and at how that hierarchy looks in a typical Java
application.
All in all, even if you don’t fight class-loading issues or
create plugin architectures every day, understanding class
loading helps you to understand what is happening in your
application. It also provides insight into how several Java
tools work. And it really demonstrates the benefits of keeping
your classpath clean and up to date.
Source : JAVA MAGAZINE 20151112