Java™ I/O Performance Tuning
Authors:
Daniel Lord
Technology Manager, Java™ and Developer Products
Sun Microsystems, Inc.
Achut Reddy
Staff Engineer
Sun Microsystems, Inc.
White Paper
Many Java™ programs that perform I/O are excellent candidates for performance
tuning. One of the more common problems in Java applications is inefficient
I/O. A profile of Java applications and applets that handle significant
volumes of data will show significant time spent in I/O routines, implying
that substantial gains can be had from I/O performance tuning. In fact,
I/O performance issues usually overshadow all other performance issues,
making them the first area to concentrate on when tuning performance.
I/O efficiency should therefore be a high priority for developers looking to
improve performance. Unfortunately, efficient reading and writing can be challenging
in Java. This white paper will assist developers in overcoming that challenge.
Once an application's reliance upon I/O is established and I/O is determined
to account for a substantial slice of the application's execution time,
performance tuning can be undertaken. The best method for determining the
distribution of execution time among methods is to use a profiler. Sun™
Java™ WorkShop™ software provides an excellent profiler that offers detailed call
counts and execution times for each method. System method call statistics
can be tabulated as an option. This paper discusses two methods of performance
tuning: stream chaining and custom I/O classes. An example program is provided
that allows the effect of each tuning step to be measured.
Using the example program provided, JavaIOTest.java, and the techniques
described, performance improvements of more than an order of magnitude can
be achieved. Simple stream chaining provides approximately a 91% decrease in
execution time, from 28,198 milliseconds to 2,510 milliseconds, while a custom
BufferedFileReader class cuts execution time by another 75%, over 97% in total,
to 630 milliseconds for a 250 kilobyte text file on the Sun™ Solaris™ 2.6
operating environment.
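The percentages quoted above follow directly from the three timings. The short sketch below reproduces the arithmetic; the class and method names are hypothetical and are not part of JavaIOTest.java.

```java
/** Hypothetical helper that recomputes the quoted percentages from the quoted timings. */
class TimeReduction {
    /** Percentage by which execution time dropped going from beforeMs to afterMs. */
    static double percentReduction(double beforeMs, double afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    public static void main(String[] args) {
        double unbuffered = 28198; // DataInputStream alone (milliseconds)
        double chained = 2510;     // with buffering added to the stream chain
        double custom = 630;       // custom BufferedFileReader
        System.out.println("stream chaining: " + percentReduction(unbuffered, chained));
        System.out.println("custom reader:   " + percentReduction(chained, custom));
        System.out.println("total:           " + percentReduction(unbuffered, custom));
    }
}
```

The three values come out at roughly 91.1, 74.9, and 97.8 percent, which matches the rounded figures in the text.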
-
-
Java performance is currently a topic of great interest. Performance
is usually hotly debated for any relatively new language or operating environment,
so this is not surprising. However, Java's reliance upon the availability
of sufficient network bandwidth for the downloading of classes shifts the
relative benefits of some options for optimization. The reliance on the
network penalizes optimization techniques that favor increasing code size
in order to provide faster execution. The resulting optimized classes can
take longer to download to the client. Of course, server-side Java is not
as acutely affected by code size and developers can even consider native
code compilers for that case. Based upon anecdotal evidence, most Java
development today seems to be concentrated on client-side applets with
the result that download times are an important criterion. Java optimization
efforts, therefore, need to be well-researched and considered.
Because Java is a relatively new language, optimizing compiler features
are less sophisticated than those available for C and C++, leaving room
for more "hand-crafting". The "hand" optimization of key sections identified
by profilers, such as the profiler available in Sun's Java WorkShop
2.0, can reap substantial benefits.
One of the more common problems in Java applications is inefficient
I/O. A profile of Java applications and applets that handle significant
volumes of data will show significant time spent in I/O routines, implying
substantial gains can be had from I/O performance tuning. In fact,
I/O performance issues usually overshadow all other performance issues,
making them the first area to concentrate on when tuning performance.
I/O efficiency should therefore be a high priority for developers looking to
improve performance. Unfortunately, efficient reading and writing can be challenging
in Java. Streamlining the use of I/O often results in greater performance
gains than all other possible optimizations combined. It is not uncommon
to see a speed improvement of at least an order of magnitude using efficient
I/O techniques, as this paper and the example program will demonstrate.
This white paper focuses on the improvement gains possible through
careful use of both the existing Java I/O classes and the introduction
of a custom file reader, BufferedFileReader. BufferedFileReader is responsible
for some of the performance increase of Java WorkShop version 2.0 over
version 1.0. An example application is used to read files of three different
sizes, ranging from 100 kilobytes to 500 kilobytes, and the results are
compared for various optimizations. The source for the example used in
this white paper is available along with this paper.
-
-
As a demonstration of I/O performance tuning, this paper will describe
the process of tuning a sample program created expressly for this paper:
JavaIOTest. JavaIOTest tracks the execution times for several I/O schemes
starting with a very basic DataInputStream method and culminating with
the use of a custom-buffered, file-reader class, while demonstrating the
performance improvements obtained by several program design changes during
the tuning effort. The execution times discussed here and detailed in
Appendix One are meant to show the relative improvements possible. The
actual execution times will vary widely among systems. Readers
are cautioned that what matters is the relative improvement on the same system,
test-to-test, and that comparisons across operating environments and systems
are complex and the results can be specious.
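The kind of measurement described above can be sketched with a small timing helper. This is an illustrative assumption about how such a harness might look, not the code in JavaIOTest.java; the name IoTimer is hypothetical, and it uses System.nanoTime, which is newer than the JDK discussed in this paper.

```java
/** Hypothetical timing helper; not taken from JavaIOTest.java. */
class IoTimer {
    /**
     * Runs the task the given number of times (at least once is expected)
     * and returns the fastest run in milliseconds.
     */
    static long bestOfMillis(int runs, Runnable task) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in workload; a real test would read a file here.
        long ms = bestOfMillis(3, () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("best of 3: " + ms + " ms");
    }
}
```

Taking the best of several runs on the same machine is one way to reduce noise from other processes; it supports only the same-system, test-to-test comparisons that the text recommends.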
Basic I/O: DataInputStream
The I/O method used in this section is a DataInputStream chained to a FileInputStream,
as shown in Figure 1. This method of reading a file is very common because
it is simple, but it is extremely slow. The reason for the poor
performance is that the DataInputStream class does no buffering, so the resulting
reads are done one byte at a time. Several instances of this technique
have been found in the JDK™ software as well as in several
"real" Java programs, providing fertile ground for improvement through a tuning regime.
try {
    // Unbuffered chain: readLine() pulls bytes one at a time
    // straight from the FileInputStream.
    DataInputStream in =
        new DataInputStream(new FileInputStream(args[i]));
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" DISBISTest: exception:" + e );
}
Figure 1. DataInputStream.readLine()
The results of using the default, basic I/O scheme are as follows.
The first section of the example program, JavaIOTest, showed run times
of 28,198 milliseconds reading a 250 kilobyte file. The full test
results are illustrated in Appendix One.
An Improvement: BufferedInputStream
A simple improvement involves buffering the FileInputStream by interposing
a BufferedInputStream in the stream chain. This buffers the data with the
default buffer size of 2048 bytes. Figure 2 illustrates the minor source
code change required.
try {
    FileInputStream fs = new FileInputStream(args[i]);
    // The BufferedInputStream sits between the data stream and the file,
    // so most bytes requested by readLine() come from memory.
    DataInputStream in = new DataInputStream(new
        BufferedInputStream(fs));
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" DISBISTest: exception:" + e );
}
Figure 2. BufferedInputStream data buffering
The resulting performance increase for the medium sized file (250
kilobytes) was 91%, from 28,198 milliseconds to 2,510 -- over an order
of magnitude with just a simple change. The full results are
summarized in the tables in Appendix One.
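One way to see why the buffered chain is so much faster is to count how many read requests actually reach the underlying stream. The sketch below is an illustration only: it uses an in-memory stream in place of a file, and the classes ReadCallCounter and CountingInputStream are hypothetical names introduced here.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: counts the read requests that reach the underlying stream. */
class ReadCallCounter {
    /** Wrapper that counts every read request passed down to the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        long calls;
        CountingInputStream(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads all lines with DataInputStream.readLine() and returns the request count. */
    @SuppressWarnings("deprecation")
    static long countCalls(byte[] data, boolean buffered) {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        // 2048 is the buffer size quoted in the text above.
        InputStream chain = buffered ? new BufferedInputStream(source, 2048) : source;
        try (DataInputStream in = new DataInputStream(chain)) {
            while (in.readLine() != null) {
                // only the number of underlying reads matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    /** Builds newline-terminated ASCII test data. */
    static byte[] sampleData(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = sampleData(2000);
        System.out.println("bytes:            " + data.length);
        System.out.println("unbuffered reads: " + countCalls(data, false));
        System.out.println("buffered reads:   " + countCalls(data, true));
    }
}
```

Without the buffer, the request count is at least the number of bytes in the data; with it, the count drops to roughly one request per 2048 bytes. On a real file each of those requests can be a system call, which is where the time goes.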
The New JDK 1.1 Classes
The foregoing method provides a substantial performance improvement
but has a serious flaw: the readLine() method of DataInputStream does not
properly handle Unicode characters. The problem is that the method assumes
all characters are one byte in length, while Unicode characters are two bytes
in length. This method has been deprecated beginning in JDK 1.1. Since
the use of deprecated methods is discouraged, the FileReader and BufferedReader
classes should be substituted for the stream classes. Unfortunately, the scheme
to provide for Unicode character localization consists of invoking
a locale-dependent converter on the raw bytes to convert them to Java
characters, causing an extra copy operation per character. This penalty is
offset by other efficiencies in the code. The code change is shown
in Figure 3.
try {
    // FileReader converts bytes to chars; BufferedReader buffers the chars.
    BufferedReader in = new BufferedReader(new FileReader(args[i]), 8192);
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" BR8192Test: exception:" + e );
}
Figure 3. BufferedReader using default buffer size
The resulting performance increase for the medium file size was 57%,
from 2,510 to 1,092 milliseconds. The full results are summarized in the
tables in Appendix One.
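The character-handling flaw described above can be made visible with a few bytes of non-ASCII text. The sketch below is an illustration under stated assumptions: it reads from an in-memory byte array, and it names the UTF-8 encoding explicitly through InputStreamReader so that the result does not depend on the platform default converter that FileReader would use. The class name UnicodeReadLine is hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo comparing byte-per-char reading with a decoding Reader. */
class UnicodeReadLine {
    /** Deprecated path: every byte becomes one char, with no decoding. */
    @SuppressWarnings("deprecation")
    static String viaDataInputStream(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reader path: bytes are decoded to chars before lines are assembled. */
    static String viaBufferedReader(byte[] bytes) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "caf" plus U+00E9, which UTF-8 encodes as two bytes.
        byte[] bytes = "caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("DataInputStream length: " + viaDataInputStream(bytes).length());
        System.out.println("BufferedReader length:  " + viaBufferedReader(bytes).length());
    }
}
```

The DataInputStream path returns five characters because the two-byte sequence is split into two unrelated characters, while the Reader path returns the intended four.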
Buffer Size Effects
The buffer size used in buffering schemes is important for performance.
As a rule of thumb, bigger is better, up to a point. In order to examine the
impact of the buffer size, a test run was made with a smaller buffer than
the default of 8,192 bytes used in the BufferedReader class. Figure 4 shows
the code segment using a reduced buffer size of 1,024 bytes.
try {
    // Same chain as Figure 3, but with a deliberately small buffer.
    BufferedReader in = new BufferedReader(new FileReader(args[i]),
        1024);
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" BR1024Test: exception:" + e );
}
Figure 4. BufferedReader using small (1024 byte) buffer size
Depending upon the file size and platform used for testing,
the larger buffer size provided performance improvements ranging from 3%
to 13%. The use of a large buffer size will improve performance
and should be considered unless local memory is restricted.
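A buffer-size comparison of this kind can be sketched as a loop over candidate sizes. The example below is a hypothetical harness, not the code in JavaIOTest.java: it generates its own temporary text file, and the class and method names are assumptions made for illustration. Timings are printed without any expected values because they depend entirely on the system used.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

/** Hypothetical harness that reads the same file with several buffer sizes. */
class BufferSizeTest {
    /** Counts lines using a BufferedReader with the given buffer size (in chars). */
    static int countLines(String fileName, int bufferSize) {
        int nlines = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(fileName), bufferSize)) {
            while (in.readLine() != null) {
                nlines++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return nlines;
    }

    /** Writes a temporary text file with the requested number of identical lines. */
    static Path makeSampleFile(int lines) {
        try {
            Path file = Files.createTempFile("buffer-size-test", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, Collections.nCopies(lines, "The quick brown fox jumps over the lazy dog"));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String file = makeSampleFile(50_000).toString();
        for (int size : new int[] {512, 1024, 8192, 65536}) {
            long start = System.nanoTime();
            int n = countLines(file, size);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("buffer size " + size + ": " + n + " lines in " + ms + " ms");
        }
    }
}
```

Running each size several times and comparing results on the same machine gives a fairer picture than a single pass, since the first read of a file is often slower than later ones.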
Summary
Using simple stream-chaining techniques, the execution time of an
I/O-bound Java program has been reduced by an average of 97% compared with
using the simple DataInputStream class. This is a substantial improvement for
a little extra design work, and one that could mean the difference between
shipping and re-designing an interactive application.
-
Tuning with Custom I/O Classes
-
To this point, tuning has focused on using the core classes distributed
with the JDK. With each version of the JDK, more effort seems to be going
into tuning critical sections for performance. The improvement in speed
of the BufferedReader class over the BufferedInputStream class despite
the additional copy per character hints at this. However, if the application
needs to read large files, a custom class can be created to further tune
performance. The BufferedReader.readLine() method creates an instance of
StringBuffer to hold the characters in the line it reads. It then converts
the StringBuffer to String, resulting in two more copies per character.
The BufferedFileReader class utilizes a modified readLine() method that
avoids the extra double copy in most cases. It also adds the convenience
of creating the FileReader class for the caller. Figure 5 shows the changes
required to use this class. The resulting performance increase for the
medium file size was 32% overall, to 630 milliseconds. The full results are
summarized in the tables in Appendix One.
BufferedFileReader in = new BufferedFileReader();
for (int i = 0; i < nargs; i++) {
    try {
        // open() lets one reader, and its buffer, be reused for each file.
        in.open(args[i]);
        while ((line = in.readLine()) != null) {
            nlines++;
        }
        in.close();
    } catch (Exception e) {
        System.out.println(" BFRTest: exception:" + e );
    }
}
Figure 5. BufferedFileReader
The BufferedFileReader class is being used in Java WorkShop (package
sun.jws.util). The documentation comment in Figure 6 describes the efficiencies
added.
Provides a single, efficient class through which a file may be read, without
having to chain together several different classes, as with the standard JDK
classes. It is also more efficient (typically faster) than the fastest JDK
classes. Specific optimizations include:
- More efficiently coded readLine() method. Avoids the extra double copy
of each character in most cases.
- Adds open() method, so the class can be reused when several files
are read in a loop. This avoids repeated allocation and
deallocation of buffers.
Performs the correct byte-to-char conversion, so that it works
properly for Unicode and i18n.
Example 1 - single file:
    BufferedFileReader in = new BufferedFileReader("Foo.java");
    while ((line = in.readLine()) != null) {
        ...
    }
Example 2 - multiple files:
    BufferedFileReader in = new BufferedFileReader();
    int n = files.size();
    for (int i=0; i<n; i++) {
        in.open(files.elementAt(i));
        while ((line = in.readLine()) != null) {
            ...
        }
        in.close();
    }
This class contains a self-benchmarking test in its main() method that can
be used to measure the exact speedup on a particular system.
Figure 6. BufferedFileReader documentation
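The two ideas described in this section, reusing one buffer across several sources and building each String directly from the buffer, can be sketched as follows. This is not the BufferedFileReader source, which is not reproduced in this paper. The class ReusableLineReader is a hypothetical illustration, and it accepts any Reader rather than a file name so that the example is self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a reusable line reader; NOT the BufferedFileReader source. */
class ReusableLineReader {
    private final char[] buf;   // allocated once, reused for every source
    private Reader in;
    private int pos;            // next unread index in buf
    private int limit;          // number of valid chars in buf
    private boolean skipLF;     // previous line ended with '\r'

    ReusableLineReader(int bufferSize) { buf = new char[bufferSize]; }

    /** Attaches a new source and resets the read state, keeping the buffer. */
    void open(Reader source) {
        in = source;
        pos = 0;
        limit = 0;
        skipLF = false;
    }

    void close() throws IOException {
        if (in != null) { in.close(); in = null; }
    }

    /** Returns the next line without its terminator, or null at end of input. */
    String readLine() throws IOException {
        StringBuilder spill = null; // used only when a line spans a buffer refill
        while (true) {
            if (pos >= limit) {
                limit = in.read(buf, 0, buf.length);
                pos = 0;
                if (limit <= 0) { // end of input
                    limit = 0;
                    return (spill != null) ? spill.toString() : null;
                }
            }
            if (skipLF) { // swallow the '\n' of a "\r\n" pair
                skipLF = false;
                if (buf[pos] == '\n') { pos++; continue; }
            }
            int start = pos;
            while (pos < limit && buf[pos] != '\n' && buf[pos] != '\r') pos++;
            if (pos < limit) { // terminator found
                String line = (spill == null)
                        ? new String(buf, start, pos - start) // common case: one copy
                        : spill.append(buf, start, pos - start).toString();
                skipLF = (buf[pos] == '\r');
                pos++;
                return line;
            }
            if (spill == null) spill = new StringBuilder(128);
            spill.append(buf, start, pos - start); // save the partial line, then refill
        }
    }

    /** Convenience for the demo: reads every line of one source. */
    List<String> readAll(Reader source) {
        List<String> lines = new ArrayList<>();
        try {
            open(source);
            for (String line; (line = readLine()) != null; ) lines.add(line);
            close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        ReusableLineReader in = new ReusableLineReader(8192);
        System.out.println(in.readAll(new StringReader("first\nsecond\r\nthird")));
        System.out.println(in.readAll(new StringReader("another source\n")));
    }
}
```

When a whole line sits inside the buffer, the String is created with a single copy. The StringBuilder is needed only for lines that straddle a refill, which is rare when the buffer is much larger than a typical line.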
-
-
Although the example in Figure 5 is as much as 45
times faster than the example in Figure 1 (and actually comprises fewer
lines of code), it is still far from the best that can be done. There
are at least two more major optimizations that can be done if still higher
performance is required and we are willing to do a little more
work.
First, if we look at the first line of the while
loop, we see that a new String object is being created for every line of
the file being read:
while ((line = in.readLine()) != null) {
This means, for example, that for a 100,000 line
file, 100,000 String objects would be created. Creating a large number
of objects incurs costs in three ways:
- Time and memory to allocate the space for the objects.
- Time to initialize the objects.
- Time to garbage collect the objects.
The problem here is that the I/O buffer is private;
the user cannot access it directly. Therefore, BufferedFileReader
must create a new String object in order to return the data to the user.
Although this follows the conventional assertion that class structures should
largely be private in order to control data access, the performance penalty
is too high an insurance premium for this case.
To get around this problem, the user must manage
the buffer directly without using the BufferedReader or BufferedFileReader
convenience classes. This will enable the user to reuse buffers rather
than creating a new object each time to hold the data.
Second, Strings are inherently less efficient
than arrays based upon char. This is because the user must call a
method to access each character of a String, whereas the characters can
be accessed directly in a char array. Hence, our code example can
be made more efficient by avoiding Strings entirely, and using char arrays
directly.
Figure 7 shows the code which implements the two
optimizations above. It is substantially more lines of code than the previous
examples, but tests show it performs as much as 3 times faster than the
example in Figure 5.
// This example uses character arrays instead of Strings.
// It doesn't use BufferedReader or BufferedFileReader, but does
// the buffering by itself so that it can avoid creating too many
// String objects. For simplicity, it assumes that no line will be
// longer than 128 characters.
FileReader fr;
int nlines = 0;
char buffer[] = new char[8192 + 1];
int maxLineLength = 128;    // assumes no line is longer than this
char lineBuf[] = new char[maxLineLength];
for (int i = 0; i < nargs; i++) {
    try {
        fr = new FileReader(args[i]);
        int nChars = 0;
        int nextChar = 0;
        int startChar = 0;
        boolean eol = false;
        int lineLength = 0;
        char c = 0;
        int n;
        int j;
        while (true) {
            if (nextChar >= nChars) {
                // Buffer exhausted: refill it from the file.
                n = fr.read(buffer, 0, 8192);
                if (n == -1) {  // EOF
                    break;
                }
                nChars = n;
                startChar = 0;
                nextChar = 0;
            }
            // Scan the buffer for the next line terminator.
            for (j = nextChar; j < nChars; j++) {
                c = buffer[j];
                if ((c == '\n') || (c == '\r')) {
                    eol = true;
                    break;
                }
            }
            nextChar = j;
            int len = nextChar - startChar;
            if (eol) {
                nextChar++;
                if ((lineLength + len) > maxLineLength) {
                    // error
                } else {
                    System.arraycopy(buffer, startChar, lineBuf, lineLength, len);
                }
                lineLength += len;
                //
                // Process line here
                //
                nlines++;
                if (c == '\r') {
                    // Treat "\r\n" as one terminator, even across a refill.
                    if (nextChar >= nChars) {
                        n = fr.read(buffer, 0, 8192);
                        if (n != -1) {
                            nextChar = 0;
                            nChars = n;
                        }
                    }
                    if ((nextChar < nChars) && (buffer[nextChar] == '\n'))
                        nextChar++;
                }
                startChar = nextChar;
                lineLength = 0;
                eol = false;    // reset, or the next scan reports a stale end of line
                continue;
            }
            // No terminator yet: save the partial line before refilling.
            if ((lineLength + len) > maxLineLength) {
                // error
            } else {
                System.arraycopy(buffer, startChar, lineBuf, lineLength, len);
            }
            lineLength += len;
        }
        fr.close();
    } catch (Exception e) {
        System.out.println("exception: " + e);
    }
}
Figure 7. Example with user-managed buffers and char arrays
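When the per-line processing does not need the line contents at all, as with the line counting used throughout this paper, the same user-managed-buffer idea can be reduced to a small state machine. The sketch below is a hypothetical simplification, not part of JavaIOTest.java. It creates no String objects, has no maximum line length, and also counts a final line that lacks a terminator.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical line counter that reuses one char buffer and creates no Strings. */
class CharBufferLineCounter {
    /** Counts lines ended by "\n", "\r", or "\r\n", plus an unterminated last line. */
    static int countLines(Reader in, char[] buf) throws IOException {
        int lines = 0;
        boolean lastWasCR = false; // carried across refills so "\r\n" counts once
        boolean inLine = false;    // true once a line has content but no terminator yet
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (c == '\n') {
                    if (!lastWasCR) lines++; // "\r\n" was already counted at '\r'
                    lastWasCR = false;
                    inLine = false;
                } else if (c == '\r') {
                    lines++;
                    lastWasCR = true;
                    inLine = false;
                } else {
                    lastWasCR = false;
                    inLine = true;
                }
            }
        }
        if (inLine) lines++; // final line without a terminator
        return lines;
    }

    /** Convenience wrapper for in-memory text with a chosen buffer size. */
    static int countLines(String text, int bufferSize) {
        try (Reader in = new StringReader(text)) {
            return countLines(in, new char[bufferSize]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLines("a\r\nbb\n\nccc", 8192));
    }
}
```

Because the terminator state is kept in two booleans rather than in buffer positions, the result does not depend on where the buffer boundaries fall, which makes the refill logic much shorter than in a version that must also copy line contents.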
-
-
The results of running the test program used for this paper, JavaIOTest,
on text files ranging from 100 kilobytes to 500 kilobytes in size are summarized
in the tables below for the Sun Solaris 2.6 platform. The relative performance
numbers are more important than the absolute numbers, since the system was
neither isolated nor used exclusively for the test processes. As the automobile
industry states in its disclaimers, "your mileage may vary". Readers are
again cautioned that what matters is the relative improvement on the same
system, test-to-test, and that comparisons across operating environments
and systems are complex and the results can be specious.
Table 1. Sun Solaris 2.6 Small File I/O Performance Comparison
Java I/O Class Performance for Small File (100 kilobytes)
Column headings: Sequential Time Reduction (Pct.) | Aggregate Time Reduction (Pct.)
Row label: DataInputStream(BufferedInputStream):
Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation™ 5, Sun Solaris 2.6 Beta, Sun Solaris
SPARC™ Edition Beta JDK/JIT v1.1.3 Green Threads.
Table 2. Sun Solaris 2.6 Medium File I/O Performance Comparison
Java I/O Class Performance for Medium File (250 kilobytes)
Column headings: Sequential Time Reduction (Pct.) | Aggregate Time Reduction (Pct.)
Row label: DataInputStream(BufferedInputStream):
Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation 5, Sun Solaris 2.6 Beta, Sun Solaris
SPARC Edition Beta JDK/JIT v1.1.3 Green Threads.
Table 6. Sun Solaris 2.6 Large File I/O Performance Comparison
Java I/O Class Performance for Large File (500 kilobytes)
Column headings: Sequential Time Reduction (Pct.) | Aggregate Time Reduction (Pct.)
Row label: DataInputStream(BufferedInputStream):
Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation 5, Sun Solaris 2.6 Beta, Sun Solaris
SPARC Edition Beta JDK/JIT v1.1.3 Green Threads.
-
-
To use the performance profiler in Java WorkShop 2.0, simply click
the Profile button on the main toolbar (the third button from the left;
represented by the watch icon). The profiler runs the program
and collects performance information, and then displays the results in
the Profile Results window. The results are saved in a file called ProjectName.prof.
If a set of results needs to be saved in order to compare them to subsequent
profiles, the file should be renamed. The File -> Open command in the Profile
Results window will load a chosen .prof file. The Profile Results
window's menu commands allow the display of the results in different ways.
Of particular interest is the ability to filter out system routines (or
include them) in the results. The results can be displayed as cumulative
execution times and the number of times a method is called, along with the
callers of that method.
-
-
Package Contents
- JavaIOTest.java, JavaIOTest.class: the source and
compiled test application.
- sunsoft/util/io/BufferedFileReader: the custom I/O class.
- small.txt, medium.txt, large.txt: various text files
for testing file reading schemes.
- The Java WorkShop 2.0 project files for the project,
so that the profiler can be used.
Installing the files (instructions vary by platform)
- Sun Solaris (SPARC/X86)
  - Download the compressed tar file for the source,
  JavaIOTest.tar.Z, containing the source and test files.
  - Move the file to the directory you choose to be the parent directory.
  - Unpack the source using "zcat JavaIOTest.tar.Z | tar -xvf -"
- Windows 95/NT
  - Download the zip file for the source, JavaIOTest.zip,
  containing the source and test files.
  - Move the file to the directory you choose to be the parent directory.
  - Unpack the source using WinZip (or other compatible Windows zip-file utility)
Running the Test Program
Tips and Troubleshooting
- Be sure that the CLASSPATH variable includes the current path so that the
Java tools can find the sunsoft/util/io directory containing the BufferedFileReader
custom I/O class. The easiest way is to include "." in your class
path and place the sunsoft root folder in your project directory.
-
-
View source code: JavaIOTest.java