This is Google's cache of www.sun.com/workshop/java/wp-javaio/ as retrieved on Fri, 31 Mar 2000 14:47:22 GMT.
Java I/O Performance Tuning
Sun Microsystems, Inc.
Java™ I/O Performance Tuning

Authors: Daniel Lord
Technology Manager, Java™ and Developer Products
Sun Microsystems, Inc.

Achut Reddy
Staff Engineer
Sun Microsystems, Inc.

White Paper



Summary

Many Java™ programs that perform I/O are excellent candidates for performance tuning, because inefficient I/O is one of the more common problems in Java applications. A profile of a Java application or applet that handles a significant volume of data will typically show significant time spent in I/O routines, which implies that substantial gains can be had from I/O performance tuning. In fact, I/O performance issues usually overshadow all other performance issues, making I/O the first area to concentrate on when tuning. I/O efficiency should therefore be a high priority for developers looking to improve performance. Unfortunately, optimal reading and writing can be challenging in Java. This white paper will assist developers in overcoming that challenge.

Once an application's reliance upon I/O is established and I/O is determined to account for a substantial slice of the application's execution time, performance tuning can be undertaken. The best method for determining the distribution of execution time among methods is to use a profiler. Sun™ Java™ WorkShop™ software provides an excellent profiler that reports detailed call counts and execution times for each method; system method call statistics can optionally be tabulated as well. This paper discusses two methods of performance tuning: stream chaining and custom I/O classes. An example program, JavaIOTest.java, is provided so that the progress of the tuning effort can be measured at each step. Using the example program and the techniques described, performance improvements of more than an order of magnitude can be achieved. Simple stream chaining reduces execution time by approximately 91%, from 28,198 milliseconds to 2,510 milliseconds, while a custom BufferedFileReader class cuts execution time by another 75% (over 97% in total) to 630 milliseconds for a 250 kilobyte text file on the Sun™ Solaris™ 2.6 operating environment.

Introduction

Java performance is currently a topic of great interest. Performance is usually hotly debated for any relatively new language or operating environment, so this is not surprising. However, Java's reliance upon the availability of sufficient network bandwidth for the downloading of classes shifts the relative benefits of some options for optimization. The reliance on the network penalizes optimization techniques that favor increasing code size in order to provide faster execution. The resulting optimized classes can take longer to download to the client. Of course, server-side Java is not as acutely affected by code size and developers can even consider native code compilers for that case. Based upon anecdotal evidence, most Java development today seems to be concentrated on client-side applets with the result that download times are an important criterion. Java optimization efforts, therefore, need to be well-researched and considered.

Because Java is a relatively new language, its optimizing compiler features are less sophisticated than those available for C and C++, leaving room for more "hand-crafting". The "hand" optimization of key sections identified by a profiler, such as the one available in Sun's Java WorkShop 2.0, can reap substantial benefits.

One of the more common problems in Java applications is inefficient I/O. A profile of a Java application or applet that handles a significant volume of data will typically show significant time spent in I/O routines, which implies that substantial gains can be had from I/O performance tuning. In fact, I/O performance issues usually overshadow all other performance issues, making I/O the first area to concentrate on when tuning. Unfortunately, optimal reading and writing can be challenging in Java. Streamlining the use of I/O often results in greater performance gains than all other possible optimizations combined. It is not uncommon to see a speed improvement of at least an order of magnitude using efficient I/O techniques, as this paper and the example program will demonstrate.

This white paper focuses on the performance gains possible through careful use of the existing Java I/O classes and through the introduction of a custom file reader, BufferedFileReader. BufferedFileReader is responsible for some of the performance increase of Java WorkShop version 2.0 over version 1.0. An example application is used to read files of three different sizes, ranging from 100 kilobytes to 500 kilobytes, and the results are compared for various optimizations. The source for the example used in this white paper is available along with this paper.

Performance Tuning Through Stream Chaining

As a demonstration of I/O performance tuning, this paper describes the process of tuning a sample program created expressly for this paper: JavaIOTest. JavaIOTest tracks the execution times for several I/O schemes, starting with a very basic DataInputStream method and culminating with the use of a custom buffered file-reader class, and so demonstrates the performance improvement obtained from each program design change during the tuning effort. The execution times discussed here and detailed in Appendix One are meant to show the relative improvements possible; the actual times will vary widely among systems. Readers are cautioned that what matters is the relative improvement on the same system, test to test. Comparisons across operating environments and systems are complex, and the results can be specious.

Basic IO: DataInputStream

The I/O method used in this section is a DataInputStream chained to a FileInputStream, as shown in Figure 1. This method of reading a file is very common since it is simple, but it is extremely slow. The reason for the poor performance is that the DataInputStream class does no buffering, so the resulting reads are done one byte at a time. Several instances of this technique have been found in the JDK™ software as well as in several "real" Java programs, providing fertile ground for improvement through a tuning regime.


try {
    DataInputStream in =
        new DataInputStream(new FileInputStream(args[i]));
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" DISBISTest: exception:" + e );
}

Figure 1. DataInputStream.readLine()

The results of using the default, basic I/O scheme are as follows. The first section of the example program, JavaIOTest, showed run times of  28,198 milliseconds reading a 250 kilobyte file. The full test results are illustrated in Appendix One.

An Improvement: BufferedInputStream

A simple improvement involves buffering the FileInputStream by interposing a BufferedInputStream in the stream chain. This buffers the data, with the default buffer size of 2048 bytes. Figure 2 illustrates the minor source code change required.

try {
    FileInputStream fs = new FileInputStream(args[i]);
    DataInputStream in = new DataInputStream(new BufferedInputStream(fs));
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" DISBISTest: exception:" + e );
}

Figure 2. BufferedInputStream data buffering

The resulting performance increase for the medium sized file (250 kilobytes) was 91%, from 28,198 milliseconds to 2,510 -- over an order of  magnitude with just a simple change. The full results are summarized in the tables in Appendix One.
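The effect of interposing a buffer can be tried in isolation with a small harness along the following lines. This is only a sketch, not part of JavaIOTest: the class name BufferingSketch, the temporary test file, and its contents are assumptions made for illustration, and the timings it prints will differ from system to system.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferingSketch {
    // Counts '\n' bytes by calling read() once per byte, the access
    // pattern that makes an unbuffered file stream slow.
    static int countNewlines(InputStream in) throws IOException {
        int count = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                count++;
            }
        }
        return count;
    }

    // Times one full pass over the stream, then closes it.
    static long timeMillis(InputStream in) throws IOException {
        long start = System.currentTimeMillis();
        try {
            countNewlines(in);
        } finally {
            in.close();
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: 250,000 bytes of short lines in a temp file.
        File f = File.createTempFile("buffering", ".txt");
        f.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(f)) {
            byte[] line = "the quick brown fox\n".getBytes("US-ASCII");
            for (int i = 0; i < 12500; i++) {
                out.write(line);
            }
        }
        System.out.println("unbuffered ms: "
                + timeMillis(new FileInputStream(f)));
        System.out.println("buffered ms:   "
                + timeMillis(new BufferedInputStream(new FileInputStream(f))));
    }
}
```

Both passes use the same per-byte loop; the only difference is whether each read() call reaches the file or is served from the in-memory buffer.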
 

The New JDK 1.1 Classes

The foregoing method provides a substantial performance improvement but has a serious flaw: the readLine() method of DataInputStream does not properly handle Unicode characters. The method assumes that every character is one byte in length, whereas Unicode characters are two bytes in length. For this reason the method has been deprecated beginning in JDK 1.1. Since the use of deprecated methods is discouraged, the FileReader and BufferedReader classes should be substituted for the stream classes used so far. Unfortunately, the scheme to provide for Unicode character localization consists of invoking a locale-dependent converter on the raw bytes to convert them to Java characters, causing an extra copy operation per character. This penalty is offset by other efficiencies in the code. The code change is shown in Figure 3.


try {
    BufferedReader in = new BufferedReader(new FileReader(args[i]), 8192);

    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" BR8192Test: exception:" + e );
}

Figure 3. BufferedReader using the default buffer size (8,192 characters)

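For the medium file size, Table 2 in Appendix One shows a time reduction of about 57%, from 2,510 to 1,092 milliseconds, for a BufferedReader with a 1,024-character buffer, and a reduction of about 63%, to 940 milliseconds, for the 8,192-character buffer shown in Figure 3. The full results are summarized in the tables in Appendix One.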
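The one-byte-per-character assumption described above can be illustrated with a short sketch. It is not the JDK's implementation: the class DecodeSketch and its two helper methods are hypothetical, and UTF-8 is chosen here only as an example of a multi-byte encoding (the paper speaks of a locale-dependent converter, and StandardCharsets is newer than JDK 1.1).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeSketch {
    // Treats every byte as one character, which is the assumption the
    // text attributes to the deprecated DataInputStream.readLine().
    static String decodeBytewise(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Lets a Reader run the byte-to-char converter for the given charset.
    static String decodeWithReader(byte[] data, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), cs)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf" plus e-acute: four characters, five bytes in UTF-8.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeBytewise(utf8).length());                            // prints 5
        System.out.println(decodeWithReader(utf8, StandardCharsets.UTF_8).length());  // prints 4
    }
}
```

The byte-wise version splits the accented character into two unrelated characters, while the Reader returns the original four-character string.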

Buffer Size Effects

The buffer size used in buffering schemes is important for performance. As a rule of thumb, bigger is better, up to a point. In order to examine the impact of the buffer size, a test run was made with a smaller buffer than the default of 8,192 characters used in the BufferedReader class. Figure 4 shows the code segment using a reduced buffer size of 1,024 characters.

try {
    BufferedReader in = new BufferedReader(new FileReader(args[i]), 1024);
    while ((line = in.readLine()) != null) {
        nlines++;
    }
    in.close();
} catch (Exception e) {
    System.out.println(" BR1024Test: exception:" + e );
}

Figure 4. BufferedReader using small (1,024 character) buffer size

Depending upon the file size and platform used for testing, the larger buffer size provided performance improvements ranging from 3% to 13%. A larger buffer improves performance noticeably and should be considered unless local memory is restricted.
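A buffer-size comparison of this kind can be sketched as follows. The class BufferSizeSketch, the generated temporary file, and the third buffer size (65,536) are assumptions added for illustration; they are not part of JavaIOTest, and a single timed pass like this is only a rough indicator.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

class BufferSizeSketch {
    // Reads all lines through a BufferedReader of the given size (in chars)
    // and returns how many lines were seen.
    static int countLines(Reader source, int bufferSize) throws IOException {
        try (BufferedReader in = new BufferedReader(source, bufferSize)) {
            int n = 0;
            while (in.readLine() != null) {
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical test data: 12,500 short lines in a temp file.
        File f = File.createTempFile("bufsize", ".txt");
        f.deleteOnExit();
        try (Writer w = new FileWriter(f)) {
            for (int i = 0; i < 12500; i++) {
                w.write("the quick brown fox\n");
            }
        }
        int[] sizes = {1024, 8192, 65536};
        for (int size : sizes) {
            long start = System.currentTimeMillis();
            int lines = countLines(new FileReader(f), size);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(size + " chars: " + lines + " lines in " + elapsed + " ms");
        }
    }
}
```

Running the loop several times and discarding the first pass gives steadier numbers, since the first pass also pays for class loading and file-system caching.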

Summary

Using simple stream-chaining techniques, the execution time of an I/O-bound Java program has been reduced by an average of about 97% compared with using the simple DataInputStream class. This is a substantial improvement for a little extra design work, and one that could mean the difference between shipping and redesigning an interactive application.

Tuning with Custom I/O Classes

To this point, tuning has focused on using the core classes distributed with the JDK. With each version of the JDK, more effort seems to be going into tuning critical sections for performance. The improvement in speed of the BufferedReader class over the BufferedInputStream class, despite the additional copy per character, hints at this. However, if the application needs to read large files, a custom class can be created to further tune performance. The BufferedReader.readLine() method creates an instance of StringBuffer to hold the characters in the line it reads. It then converts the StringBuffer to a String, resulting in two more copies per character. The BufferedFileReader class uses a modified readLine() method that avoids the extra double copy in most cases. It also adds the convenience of creating the FileReader for the caller. Figure 5 shows the changes required to use this class. The resulting performance increase for the medium file size was a further 33%, from 940 to 630 milliseconds. The full results are summarized in the tables in Appendix One.

BufferedFileReader in = new BufferedFileReader();
for (int i=0; i < nargs; i++) {
    try {
        in.open(args[i]);
        while ((line = in.readLine()) != null) {
            nlines++;
        }
        in.close();
    } catch (Exception e) {
        System.out.println(" BFRTest: exception:" + e );
    }
}

Figure 5. BufferedFileReader

The BufferedFileReader class is being used in Java WorkShop (package sun.jws.util). The documentation comment in Figure 6 describes the efficiencies added.

Provides a single, efficient class through which a file may be read, without having to chain together several different classes, as with the standard JDK classes. It is also more efficient (typically faster) than the fastest JDK classes. Specific optimizations include:

  1. More efficiently coded readLine( ) method.  Avoids
  2. Adds an open( ) method, so the class can be reused when several files are read in a loop.  This avoids repeated allocation and deallocation of buffers.

Performs the correct byte-to-char conversion, so that it works properly for Unicode and i18n.

Example 1 - single file:


BufferedFileReader in = new BufferedFileReader("Foo.java");
while ((line = in.readLine()) != null) {
...
}

Example 2 - multiple files:


BufferedFileReader in = new BufferedFileReader();
int n = files.size();
for (int i=0; i<n; i++) {
    in.open(files.elementAt(i));
    while ((line = in.readLine()) != null) {
    ...
    }
    in.close();
}

This class contains a self-benchmarking test in its main() method that can be used to measure the exact speedup on a particular system.

Figure 6. BufferedFileReader documentation
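The two optimizations listed in the documentation comment can be sketched with a small class of the following shape. This is not the source of BufferedFileReader, which is not reproduced in this paper; ReusableLineReader and all of its members are hypothetical names, and the sketch only assumes the general approach described above: one buffer that survives across open() calls, and lines built directly from that buffer when they fit inside it.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

class ReusableLineReader {
    private final char[] buf = new char[8192]; // allocated once, reused per file
    private int pos = 0;                        // next unread char in buf
    private int limit = 0;                      // number of valid chars in buf
    private boolean skipLF = false;             // true right after a '\r'
    private Reader in;

    void open(String fileName) throws IOException {
        open(new FileReader(fileName));
    }

    // Re-targets the reader at a new source without reallocating the buffer.
    void open(Reader source) {
        in = source;
        pos = 0;
        limit = 0;
        skipLF = false;
    }

    String readLine() throws IOException {
        StringBuilder partial = null; // only needed when a line spans a refill
        while (true) {
            if (pos >= limit) {
                int n = in.read(buf, 0, buf.length);
                if (n < 0) { // EOF: return any unterminated last line
                    return partial == null ? null : partial.toString();
                }
                pos = 0;
                limit = n;
                continue;
            }
            if (skipLF) { // swallow the '\n' of a "\r\n" pair
                skipLF = false;
                if (buf[pos] == '\n') {
                    pos++;
                    continue;
                }
            }
            int start = pos;
            while (pos < limit && buf[pos] != '\n' && buf[pos] != '\r') {
                pos++;
            }
            if (pos < limit) { // found a line terminator
                skipLF = (buf[pos] == '\r');
                String line;
                if (partial == null) {
                    // Common case: one copy, straight from the I/O buffer.
                    line = new String(buf, start, pos - start);
                } else {
                    line = partial.append(buf, start, pos - start).toString();
                }
                pos++;
                return line;
            }
            if (partial == null) {
                partial = new StringBuilder(80);
            }
            partial.append(buf, start, pos - start);
        }
    }

    void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
        }
    }

    public static void main(String[] args) throws IOException {
        ReusableLineReader in = new ReusableLineReader();
        for (String name : args) { // file names given on the command line
            in.open(name);
            int nlines = 0;
            while (in.readLine() != null) {
                nlines++;
            }
            in.close();
            System.out.println(name + ": " + nlines + " lines");
        }
    }
}
```

A String is still created for every line, which is the cost the next section sets out to remove.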

Further Tuning

Although the example in Figure 5 is as much as 45 times faster than the example in Figure 1 (and actually comprises fewer lines of code), it is still far from the best that can be done. There are at least two more major optimizations that can be done if still higher performance is required and we are willing to do a little more work.

First, if we look at the first line of the while loop, we see that a new String object is being created for every line of the file being read:

while ((line = in.readLine()) != null) {

This means, for example, that for a 100,000 line file 100,000 String objects would be created. Creating a large number of objects incurs costs in three ways:

  1. Time and memory to allocate the space for the objects.
  2. Time to initialize the objects.
  3. Time to garbage collect the objects.

The problem here is that the I/O buffer is private; the user cannot access it directly.  Therefore, BufferedFileReader must create a new String object in order to return the data to the user. Although this follows the conventional assertion that class structures should largely be private in order to control data access, the performance penalty is too high an insurance premium for this case.

To get around this problem, the user must manage the buffer directly without using the BufferedReader or BufferedFileReader convenience classes. This will enable the user to reuse buffers rather than creating a new object each time to hold the data.

Second, Strings are inherently less efficient than arrays based upon char. This is because the user must call a method to access each character of a String, whereas the characters can be accessed directly in a char array. Hence, our code example can be made more efficient by avoiding Strings entirely, and using char arrays directly.

Figure 7 shows the code which implements the two optimizations above. It is substantially more lines of code than the previous examples, but tests show it performs as much as 3 times faster than the example in Figure 5.


// This example uses character arrays instead of Strings.
// It doesn't use BufferedReader or BufferedFileReader, but does
// the buffering by itself so that it can avoid creating too many
// String objects.  For simplicity, it assumes that no line will be
// longer than 128 characters.

FileReader fr;
int nlines = 0;
char buffer[] = new char[8192 + 1];
int maxLineLength = 128;

// assumes no line is longer than this
char lineBuf[] = new char[maxLineLength];
for (int i=0; i < nargs; i++) {
    try {
        fr = new FileReader(args[i]);

        int nChars = 0;
        int nextChar = 0;
        int startChar = 0;
        boolean eol = false;
        int lineLength = 0;
        char c = 0;
        int n;
        int j;

        while (true) {
            // Refill the buffer once it has been consumed.
            if (nextChar >= nChars) {
                n = fr.read(buffer, 0, 8192);
                if (n == -1) {  // EOF
                    break;
                }
                nChars = n;
                startChar = 0;
                nextChar = 0;
            }

            // Scan for the next line terminator.
            for (j=nextChar; j < nChars; j++) {
                c = buffer[j];
                if ((c == '\n') || (c == '\r')) {
                    eol = true;
                    break;
                }
            }
            nextChar = j;

            int len = nextChar - startChar;
            if (eol) {
                nextChar++;
                if ((lineLength + len) > maxLineLength) {
                    // error
                } else {
                    System.arraycopy(buffer, startChar, lineBuf, lineLength, len);
                }
                lineLength += len;

                //
                // Process line here
                //
                nlines++;

                // Treat "\r\n" as a single line terminator.
                if (c == '\r') {
                    if (nextChar >= nChars) {
                        n = fr.read(buffer, 0, 8192);
                        if (n != -1) {
                            nextChar = 0;
                            nChars = n;
                        }
                    }

                    if ((nextChar < nChars) && (buffer[nextChar] == '\n'))
                        nextChar++;
                }
                startChar = nextChar;
                lineLength = 0;
                eol = false;  // reset, or a later partial buffer is miscounted as a line
                continue;
            }

            // No terminator yet: save the partial line and refill.
            if ((lineLength + len) > maxLineLength) {
                // error
            } else {
                System.arraycopy(buffer, startChar, lineBuf, lineLength, len);
            }
            lineLength += len;
        }
        fr.close();
    } catch (Exception e) {
        System.out.println("exception: " + e);
    }
}
Figure 7. Example with user-managed buffers and char arrays
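When only a per-line action such as counting is needed, the same idea (a caller-owned char array and no String objects) can be reduced to a compact, self-contained form. The sketch below is an illustration under that assumption rather than a replacement for Figure 7: the class CharArrayLineCounter is hypothetical, it does not copy line contents into a line buffer, and unlike Figure 7 it also counts a final line that lacks a terminator.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class CharArrayLineCounter {
    // Counts lines terminated by "\n", "\r", or "\r\n" without creating
    // any per-line objects. The caller supplies (and can reuse) the buffer,
    // which must have a length greater than zero.
    static int countLines(Reader in, char[] buffer) throws IOException {
        int lines = 0;
        boolean lastWasCR = false; // carried across buffer refills
        boolean inLine = false;    // true if chars were seen since the last terminator
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (c == '\n') {
                    if (!lastWasCR) {
                        lines++; // a "\r\n" pair was already counted at the '\r'
                    }
                    lastWasCR = false;
                    inLine = false;
                } else if (c == '\r') {
                    lines++;
                    lastWasCR = true;
                    inLine = false;
                } else {
                    lastWasCR = false;
                    inLine = true;
                }
            }
        }
        if (inLine) {
            lines++; // final line without a terminator
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        char[] buffer = new char[8192]; // allocate once, reuse for every input
        System.out.println(countLines(new StringReader("a\r\nbb\n\nc"), buffer)); // prints 4
    }
}
```

Because the terminator state is kept in two booleans rather than in buffer positions, a "\r\n" pair that straddles two reads needs no special refill logic.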

Appendix One: Performance Tuning Results

The results of running the test program used for this paper, JavaIOTest, on text files ranging from 100 kilobytes to 500 kilobytes in size are summarized in the tables below for the Sun Solaris 2.6 platform. The relative performance numbers are more important than the absolute numbers, since the system was neither isolated nor used exclusively for the test processes. As the automobile industry states in its disclaimers, "your mileage may vary". Readers are again cautioned that what matters is the relative improvement on the same system, test to test. Comparisons across operating environments and systems are complex, and the results can be specious.

Table 1. Sun Solaris 2.6 Small File I/O Performance Comparison

    Java I/O Class Performance for Small File (100 kilobytes)

    File I/O Class(es)                      Time (ms)   Sequential Time    Aggregate Time
                                                        Reduction (Pct.)   Reduction (Pct.)
    DataInputStream                             17264          -                  -
    DataInputStream(BufferedInputStream)         1468        -91.5%             -91.5%
    BufferedReader(1024)                          688        -53.1%             -96.0%
    BufferedReader(8192)                          613        -10.9%             -96.5%
    BufferedFileReader                            382        -37.7%             -97.8%

Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation™ 5, Sun Solaris 2.6 Beta, Sun Solaris SPARC™ Edition Beta JDK/JIT v1.1.3 Green Threads.


Table 2. Sun Solaris 2.6 Medium File I/O Performance Comparison

    Java I/O Class Performance for Medium File (250 kilobytes)

    File I/O Class(es)                      Time (ms)   Sequential Time    Aggregate Time
                                                        Reduction (Pct.)   Reduction (Pct.)
    DataInputStream                             28198          -                  -
    DataInputStream(BufferedInputStream)         2510        -91.1%             -91.1%
    BufferedReader(1024)                         1092        -56.5%             -96.1%
    BufferedReader(8192)                          940        -13.9%             -96.7%
    BufferedFileReader                            630        -33.0%             -97.8%

Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation 5, Sun Solaris 2.6 Beta, Sun Solaris SPARC Edition Beta JDK/JIT v1.1.3 Green Threads.


Table 3. Sun Solaris 2.6 Large File I/O Performance Comparison

    Java I/O Class Performance for Large File (500 kilobytes)

    File I/O Class(es)                      Time (ms)   Sequential Time    Aggregate Time
                                                        Reduction (Pct.)   Reduction (Pct.)
    DataInputStream                             60646          -                  -
    DataInputStream(BufferedInputStream)         5484        -91.0%             -91.0%
    BufferedReader(1024)                         2275        -58.5%             -96.2%
    BufferedReader(8192)                         2141         -5.9%             -96.5%
    BufferedFileReader                           1342        -37.3%             -97.8%

Source: JavaIOTest.java output to Java console via println.
Environment: Sun SPARCstation 5, Sun Solaris 2.6 Beta, Sun Solaris SPARC Edition Beta JDK/JIT v1.1.3 Green Threads.
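The two percentage columns can be recomputed from the Time column. The sketch below assumes the usual definitions, which the paper does not spell out: the sequential reduction compares each row with the row directly above it, and the aggregate reduction compares each row with the DataInputStream baseline. The class and method names are hypothetical.

```java
class ReductionSketch {
    // Percentage by which elapsed time fell going from `before` to `after`.
    static double reductionPct(long before, long after) {
        return 100.0 * (before - after) / before;
    }

    // Rounds to one decimal place, matching the precision used in the tables.
    static double round1(double v) {
        return Math.round(v * 10.0) / 10.0;
    }

    public static void main(String[] args) {
        // Times in milliseconds for the medium file, top row to bottom row.
        long[] ms = {28198, 2510, 1092, 940, 630};
        for (int i = 1; i < ms.length; i++) {
            double sequential = round1(reductionPct(ms[i - 1], ms[i]));
            double aggregate = round1(reductionPct(ms[0], ms[i]));
            System.out.println(ms[i] + " ms: sequential -" + sequential
                    + "%, aggregate -" + aggregate + "%");
        }
    }
}
```

For example, going from 28,198 to 2,510 milliseconds gives 100 × (28,198 − 2,510) / 28,198, or about 91.1%.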

Appendix Two: Profiling in Java WorkShop 2.0

To use the performance profiler in Java WorkShop 2.0, simply click the Profile button on the main toolbar (the third button from the left, represented by the watch icon). The profiler runs the program, collects performance information, and then displays the results in the Profile Results window. The results are saved in a file called ProjectName.prof. If a set of results needs to be kept for comparison with subsequent profiles, the file should be renamed. The File -> Open command in the Profile Results window will load a chosen .prof file. The Profile Results window's menu commands allow the results to be displayed in different ways. Of particular interest is the ability to filter out system routines (or include them) in the results. The results can be displayed as cumulative execution times and as the number of times a method is called, along with the callers of that method.

Appendix Three: Downloading, Installation, and Troubleshooting

  Package Contents
  • JavaIOTest.java, JavaIOTest.class: the source and compiled test application.
  • sunsoft/util/io/BufferedFileReader: the custom I/O class.
  • small.txt, medium.txt, large.txt: various text files for testing file reading schemes.
  • The Java WorkShop 2.0 project files for the project, so that the profiler can be used.
 

Installing the files (instructions vary by platform)

  • Sun Solaris (SPARC/X86)  
    • Download the compressed tar file, JavaIOTest.tar.Z, containing the source and test files.
    • Move the file to the directory you choose to be the parent directory.
    • Unpack the source using "zcat JavaIOTest.tar.Z | tar -xvf -"

  • Windows 95/NT
    • Download the zip file, JavaIOTest.zip, containing the source and test files.
    • Move the file to the directory you choose to be the parent directory.
    • Unpack the source using WinZip (or other compatible Windows zip-file utility)

Running the Test Program 

    Running from the shell
    • Sun Solaris:   java JavaIOTest <text-file(s)-name-to-read>

    • example:  java JavaIOTest small.txt  medium.txt
       
    • Microsoft Windows : java JavaIOTest <text-file(s)-name-to-read>

    • example: java JavaIOTest c:\JavaIOTest\small.txt c:\JavaIOTest\medium.txt

    Running from Java WorkShop

    • Be sure to edit the project file to add the paths to the text files to read. Accomplish this by selecting the menu Project -> Edit and type the file names in the Program Arguments text box on the Run tab-panel. Make sure to build the project as an application.
     

Tips and Troubleshooting 

  • Be sure that the CLASSPATH variable includes the current path so that the Java tools can find the sunsoft/util/io directory containing the BufferedFileReader custom I/O class. The easiest way is to include "." in your class path and place the sunsoft root folder in your project directory.

Appendix Four: JavaIOTest.java Source Code

View source code: JavaIOTest.java

 Copyright 1994-2000 Sun Microsystems, Inc.,  901 San Antonio Road, Palo Alto, CA 94303 USA. All rights reserved.