How System.out.println() really works

A few days ago I came across an interesting article, Where the printf() Rubber Meets the Road, describing how the printf function ‘works’ at a low level.

Java beginners commonly ask, “How does System.out.println() work?” The above blog post inspired me to do some research into this question. In this blog post I’ll attempt to explain what System.out.println() does behind the scenes.

Most of the relevant code can be found in OpenJDK, the open-source reference implementation of Java. System.out.println() may be implemented completely differently on other Java platforms.

I will warn you now that this article is very long and not for the easily bored.

First steps

Our first step in figuring out how System.out.println works is to understand what System.out is and how it came to be.

Let’s take a look through OpenJDK’s online Mercurial repository. Digging around a bit, we find System.java. In this file System.out is declared:

public final static PrintStream out = nullPrintStream();

But here is the code for nullPrintStream():

private static PrintStream nullPrintStream() throws NullPointerException {
    if (currentTimeMillis() > 0) {
        return null;
    }
    throw new NullPointerException();
}

So nullPrintStream() simply returns null or throws an exception. This can’t be it. What’s going on here?

The answer can be found in the function initializeSystemClass(), also in System.java:

FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out);
setOut0(new PrintStream(new BufferedOutputStream(fdOut, 128), true));

There’s a lot going on in this code. I’m going to refer back to these two lines later, but setOut0() is what actually initializes System.out.

The function setOut0() is a native function. We can find its implementation in System.c:

JNIEXPORT void JNICALL
Java_java_lang_System_setOut0(JNIEnv *env, jclass cla, jobject stream)
{
    jfieldID fid =
        (*env)->GetStaticFieldID(env,cla,"out","Ljava/io/PrintStream;");
    if (fid == 0)
        return;
    (*env)->SetStaticObjectField(env,cla,fid,stream);
}

This is pretty standard JNI code that sets System.out to the argument passed to it.

At first, this whole business of setting System.out to nullPrintStream() and later setting it through JNI seems entirely unnecessary. But it is actually justified.

In Java, static fields are initialized first, and everything else comes after. So even before the JVM and the System class are fully initialized, the JVM tries to initialize System.out.

Unfortunately, the rest of the JVM isn’t properly initialized at this point, so it’s impossible to set System.out to anything reasonable. The best that can be done is to set it to null.

The System class, along with System.out, is properly initialized in initializeSystemClass(), which is called by the JVM after static and thread initialization.

There is a problem, however: System.out is final, meaning we cannot simply assign something else to it in initializeSystemClass(). Native code offers a way around that, since JNI can modify even a final field.
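The same native back door is what makes the public System.setOut() method possible: in the OpenJDK sources discussed here, it delegates to setOut0(). As a minimal sketch (the class name SetOutDemo and the capture() helper are made up for illustration), the following swaps System.out for an in-memory buffer and restores it afterwards, even though the field is declared final:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class SetOutDemo {
    // Runs the action while System.out is temporarily redirected into a buffer,
    // then returns whatever the action printed.
    static String capture(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // System.setOut() replaces the "final" field through native code.
        System.setOut(new PrintStream(buffer, true));
        try {
            action.run();
        } finally {
            System.out.flush();
            System.setOut(original); // always restore the real stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = capture(() -> System.out.print("hello"));
        System.out.println("captured: " + text);
    }
}
```

The same approach works for sending output to a file: pass a PrintStream built on a FileOutputStream to System.setOut().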

Wait, what’s a FileDescriptor?

Notice this line of code:

FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out);

A FileOutputStream object is created from something referred to as FileDescriptor.out.

The FileDescriptor class, though part of java.io, is rather elusive. It can’t be found in the java.io directory in OpenJDK.

This is because FileDescriptor is much lower level than most of the Java standard library. While most .java files are platform independent, there are actually different implementations of FileDescriptor for different platforms.

We’ll be using the Linux/Solaris version of FileDescriptor.java.

A FileDescriptor object is very simple. Essentially all it holds is an integer; it holds some other data too, but that isn’t important here. The constructor of FileDescriptor takes an integer and creates a FileDescriptor containing that integer.

For our purposes, the only use of a FileDescriptor object is to initialize a FileOutputStream object.

Let’s see how FileDescriptor.out is defined:

public static final FileDescriptor out = new FileDescriptor(1);

FileDescriptor.out is defined as 1, in as 0, and err as 2. These numbers are the standard Unix file descriptors for standard input (0), standard output (1), and standard error (2).

We now know how System.out is initialized. For now, we’re going to leave behind the FileDescriptor; we only need to know what it does.
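To see that nothing more than this descriptor is needed to reach the console, here is a small sketch that skips System.out entirely and writes through a FileOutputStream built on FileDescriptor.out. The class name RawStdout and the writeLine() helper are hypothetical, and the UTF-8 encoding is an assumption made for the example:

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RawStdout {
    // Writes the text plus a newline to file descriptor 1 and returns the byte count.
    static int writeLine(String text) {
        byte[] bytes = (text + "\n").getBytes(StandardCharsets.UTF_8);
        // Same construction as in initializeSystemClass(), minus the buffering layers.
        // Deliberately not closed: closing it would close the process's standard output.
        FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out);
        try {
            fdOut.write(bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.length;
    }

    public static void main(String[] args) {
        writeLine("written without System.out");
    }
}
```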

A tour through java.io

Now we turn our attention to the println() function of PrintStream.

PrintStream is a comparatively high-level class, capable of writing many different kinds of data, and it takes care of flushing and error handling for you.

Let’s see how println() is defined in PrintStream.java:

public void println(String x) {
    synchronized (this) {
        print(x);
        newLine();
    }
}

Following the call stack to print():

public void print(String s) {
    if (s == null) {
        s = "null";
    }
    write(s);
}

Going deeper, and looking at write():

private void write(String s) {
    try {
        synchronized (this) {
            ensureOpen();
            textOut.write(s);
            textOut.flushBuffer();
            charOut.flushBuffer();
            if (autoFlush && (s.indexOf('\n') >= 0))
                out.flush();
        }
    }
    catch (InterruptedIOException x) {
        Thread.currentThread().interrupt();
    }
    catch (IOException x) {
        trouble = true;
    }
}

Internally, the PrintStream object (System.out) contains three different objects to do its work:

  • The OutputStreamWriter object (charOut), writing character arrays into a stream
  • The BufferedWriter object (textOut), writing not only character arrays but also strings and text
  • A BufferedOutputStream object (out), passed all the way down the call stack and used at a much lower level than PrintStream

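We can see that PrintStream.write() calls BufferedWriter.write() and flushes both buffers. The charOut flush may look redundant, but it matters: the StreamEncoder inside charOut keeps its own byte buffer, and flushing charOut pushes those encoded bytes onward into the underlying output stream.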
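Notice that the catch block in write() swallows the IOException and only sets the trouble flag. That flag is what the public checkError() method reports. A minimal sketch of this behaviour, using a made-up FailingStream class that always throws:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

class TroubleDemo {
    // Hypothetical stream that rejects every write, to simulate a broken console.
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated failure");
        }
    }

    // Returns true if the PrintStream recorded the failure internally.
    static boolean printToBrokenStream() {
        PrintStream ps = new PrintStream(new FailingStream(), true);
        ps.println("this never arrives"); // does not throw
        return ps.checkError();            // reports the swallowed IOException
    }

    public static void main(String[] args) {
        System.out.println("error recorded: " + printToBrokenStream());
    }
}
```

This is why code that prints to System.out never has to declare or catch IOException.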

Delving deeper, let’s find the implementation of write(String) in BufferedWriter.java... wait, it’s not here. The function write(String) is actually defined in the abstract class Writer, in Writer.java:

public void write(String str) throws IOException {
    write(str, 0, str.length());
}

Moving back to BufferedWriter:

public void write(String s, int off, int len) throws IOException {
    synchronized (lock) {
        ensureOpen();

        int b = off, t = off + len;
        while (b < t) {
            int d = min(nChars - nextChar, t - b);
            s.getChars(b, b + d, cb, nextChar);
            b += d;
            nextChar += d;
            if (nextChar >= nChars)
                flushBuffer();
        }
    }
}

As its name suggests, BufferedWriter is buffered. Data is stored in a data buffer until it’s written all at once, or flushed. Buffered IO is much faster than simply writing to the hardware one byte at a time.

The function BufferedWriter.write() doesn’t actually write anything by itself. It only copies the characters into an internal buffer, calling flushBuffer() just when that buffer fills up. For a short string, the flushing is not done here, but back at PrintStream.write().
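The buffering is easy to observe from the outside. In this sketch (the class and method names are invented for the example), a BufferedWriter sits on top of a StringWriter, and we check how many characters have reached the StringWriter before and after an explicit flush. It assumes the text is shorter than the writer’s default buffer:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class BufferedWriterDemo {
    // Returns {characters visible before flush, characters visible after flush}.
    static int[] visibleLengths(String text) {
        StringWriter sink = new StringWriter();
        BufferedWriter writer = new BufferedWriter(sink);
        try {
            writer.write(text);                       // only fills the internal buffer
            int before = sink.getBuffer().length();
            writer.flush();                           // pushes the buffer to the sink
            int after = sink.getBuffer().length();
            return new int[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] lengths = visibleLengths("abc");
        System.out.println("before flush: " + lengths[0] + ", after flush: " + lengths[1]);
    }
}
```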

Let’s go to flushBuffer(), in the same file:

void flushBuffer() throws IOException {
    synchronized (lock) {
        ensureOpen();
        if (nextChar == 0)
            return;
        out.write(cb, 0, nextChar);
        nextChar = 0;
    }
}

We find yet another write() call, on a Writer object (out). The out object here has the type OutputStreamWriter; it is the very same object as charOut in PrintStream.

Let’s look at OutputStreamWriter.write() in OutputStreamWriter.java:

public void write(char cbuf[], int off, int len) throws IOException {
    se.write(cbuf, off, len);
}

This now transfers the job to another object, se. This object is of type sun.nio.cs.StreamEncoder. We’re going to leave the java.io directory for a while.

Let’s see the implementation of StreamEncoder.write() in StreamEncoder.java:

public void write(char cbuf[], int off, int len) throws IOException {
    synchronized (lock) {
        ensureOpen();
        if ((off < 0) || (off > cbuf.length) || (len < 0) ||
                ((off + len) > cbuf.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return;
        }
        implWrite(cbuf, off, len);
    }
}

Moving on to StreamEncoder.implWrite():

void implWrite(char cbuf[], int off, int len)
    throws IOException
{
    CharBuffer cb = CharBuffer.wrap(cbuf, off, len);

    if (haveLeftoverChar)
        flushLeftoverChar(cb, false);

    while (cb.hasRemaining()) {
        CoderResult cr = encoder.encode(cb, bb, false);
        if (cr.isUnderflow()) {
            assert (cb.remaining() <= 1) : cb.remaining();
            if (cb.remaining() == 1) {
                haveLeftoverChar = true;
                leftoverChar = cb.get();
            }
            break;
        }
        if (cr.isOverflow()) {
            assert bb.position() > 0;
            writeBytes();
            continue;
        }
        cr.throwException();
    }
}

Again this calls another function, writeBytes(). Here’s the implementation:

private void writeBytes() throws IOException {
    bb.flip();
    int lim = bb.limit();
    int pos = bb.position();
    assert (pos <= lim);
    int rem = (pos <= lim ? lim - pos : 0);

    if (rem > 0) {
        if (ch != null) {
            if (ch.write(bb) != rem)
                assert false : rem;
        } else {
            out.write(bb.array(), bb.arrayOffset() + pos, rem);
        }
    }
    bb.clear();
}

We’re done with StreamEncoder. This class essentially encodes characters into bytes and then hands the bytes to its out stream. That stream is the PrintStream itself, whose write(byte[], int, int) method forwards them to the BufferedOutputStream.
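The encode-and-drain pattern of implWrite() and writeBytes() can be reproduced with the public java.nio.charset API. The following is only a sketch of the idea, not StreamEncoder’s actual code: the class name, the UTF-8 charset, and the in-memory sink are assumptions, and the byte buffer is kept deliberately tiny so that the overflow branch is exercised:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

class EncodeLoopDemo {
    // Encodes text to UTF-8 through a small byte buffer, draining it whenever it fills.
    static byte[] encode(String text, int byteBufferSize) {
        if (byteBufferSize < 4) {
            throw new IllegalArgumentException("buffer must hold at least one 4-byte character");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer cb = CharBuffer.wrap(text);
        ByteBuffer bb = ByteBuffer.allocate(byteBufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        while (true) {
            CoderResult cr = encoder.encode(cb, bb, true);
            if (cr.isError()) {
                throw new IllegalArgumentException("cannot encode input: " + cr);
            }
            drain(bb, sink);          // plays the role of writeBytes()
            if (cr.isUnderflow()) {
                break;                // all input consumed
            }
            // otherwise: overflow, the byte buffer was full, so go around again
        }
        encoder.flush(bb);            // UTF-8 has no trailing state, but the API expects this
        drain(bb, sink);
        return sink.toByteArray();
    }

    // Same flip / write / clear sequence as in writeBytes().
    private static void drain(ByteBuffer bb, ByteArrayOutputStream sink) {
        bb.flip();
        sink.write(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
        bb.clear();
    }

    public static void main(String[] args) {
        System.out.println(encode("h\u00e9llo", 4).length + " bytes");
    }
}
```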

Let’s take a look at the code for write() in BufferedOutputStream.java:

public synchronized void write(byte b[], int off, int len) throws IOException {
    if (len >= buf.length) {
        /* If the request length exceeds the size of the output buffer,
           flush the output buffer and then write the data directly.
           In this way buffered streams will cascade harmlessly. */
        flushBuffer();
        out.write(b, off, len);
        return;
    }
    if (len > buf.length - count) {
        flushBuffer();
    }
    System.arraycopy(b, off, buf, count, len);
    count += len;
}

And BufferedOutputStream passes the baton again, this time to FileOutputStream. Remember when we instantiated fdOut as a FileOutputStream? Well, this is it, reached at the end of a long chain of method calls.
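Before moving on, the two branches of BufferedOutputStream.write() can be watched in action. This sketch uses a hypothetical RecordingStream that notes the size of every write it receives, behind a BufferedOutputStream with an 8-byte buffer (far smaller than the 128 bytes used for System.out, purely to keep the numbers small):

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class BypassDemo {
    // Hypothetical sink that records the size of each write reaching it.
    static class RecordingStream extends OutputStream {
        final List<Integer> writeSizes = new ArrayList<>();

        @Override
        public void write(int b) {
            writeSizes.add(1);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeSizes.add(len);
        }
    }

    // Returns the sizes of the writes that reached the underlying stream.
    static List<Integer> writeSizes() {
        RecordingStream sink = new RecordingStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8);
        try {
            buffered.write(new byte[4], 0, 4);   // fits: stays in the buffer, sink sees nothing
            buffered.write(new byte[16], 0, 16); // too big: buffer is flushed, then written directly
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writeSizes;
    }

    public static void main(String[] args) {
        System.out.println(writeSizes());
    }
}
```

The sink first receives the 4 buffered bytes (from flushBuffer()) and then the 16-byte array in a single direct write.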

Believe it or not, FileOutputStream is the final layer before JNI. We see the function write() in FileOutputStream.java:

public void write(byte b[], int off, int len) throws IOException {
    writeBytes(b, off, len);
}

And writeBytes():

private native void writeBytes(byte b[], int off, int len) throws IOException;

We’ve reached the end of the Java part. But we’re not quite finished.

A Review of the java.io call stack

[Figure: ‘contains’ chart of the java.io objects involved]

[Figure: the entire call stack]
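The chain of calls traced in the previous section can also be observed on a running JVM. As a rough sketch (class and method names are invented here), we put a custom OutputStream underneath a PrintStream and record the stack trace of the first write that reaches it. The exact frames depend on the JDK version, so the output may not match the OpenJDK sources quoted in this post line for line:

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

class CallChainDemo {
    // An OutputStream that remembers the stack trace of the first write it receives.
    static class TracingStream extends OutputStream {
        StackTraceElement[] firstWrite;

        @Override
        public void write(int b) {
            record();
        }

        @Override
        public void write(byte[] b, int off, int len) {
            record();
        }

        private void record() {
            if (firstWrite == null) {
                firstWrite = new Throwable().getStackTrace();
            }
        }
    }

    // Returns "class.method" for every frame between println() and the sink.
    static List<String> traceOfPrintln() {
        TracingStream sink = new TracingStream();
        PrintStream ps = new PrintStream(sink, true);
        ps.println("hello");
        List<String> frames = new ArrayList<>();
        for (StackTraceElement e : sink.firstWrite) {
            frames.add(e.getClassName() + "." + e.getMethodName());
        }
        return frames;
    }

    public static void main(String[] args) {
        for (String frame : traceOfPrintln()) {
            System.out.println(frame);
        }
    }
}
```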

Stepping into the JNI

After FileOutputStream, the writing of bytes to the console is handled natively. Much of this native code is platform dependent: there are different versions of the code for Windows and Linux. We’re going to deal with the Linux versions first.

The native implementation of writeBytes() is defined in FileOutputStream_md.c.

JNIEXPORT void JNICALL
Java_java_io_FileOutputStream_writeBytes(JNIEnv *env,
    jobject this, jbyteArray bytes, jint off, jint len) {
    writeBytes(env, this, bytes, off, len, fos_fd);
}

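The variable fos_fd is a JNI field ID that identifies the fd field of FileOutputStream, which holds the FileDescriptor object we visited earlier. Through it, the native code can read the integer stored in that FileDescriptor, so for the out stream the value should be 1.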

We’re just calling a function, writeBytes(), with the additional argument fos_fd. The implementation of writeBytes() is defined in io_util.c:

void
writeBytes(JNIEnv *env, jobject this, jbyteArray bytes,
           jint off, jint len, jfieldID fid)
{
    jint n;
    char stackBuf[BUF_SIZE];
    char *buf = NULL;
    FD fd;

    if (IS_NULL(bytes)) {
        JNU_ThrowNullPointerException(env, NULL);
        return;
    }

    if (outOfBounds(env, off, len, bytes)) {
        JNU_ThrowByName(env, "java/lang/IndexOutOfBoundsException", NULL);
        return;
    }

    if (len == 0) {
        return;
    } else if (len > BUF_SIZE) {
        buf = malloc(len);
        if (buf == NULL) {
            JNU_ThrowOutOfMemoryError(env, NULL);
            return;
        }
    } else {
        buf = stackBuf;
    }

    (*env)->GetByteArrayRegion(env, bytes, off, len, (jbyte *)buf);

    if (!(*env)->ExceptionOccurred(env)) {
        off = 0;
        while (len > 0) {
            fd = GET_FD(this, fid);
            if (fd == -1) {
                JNU_ThrowIOException(env, "Stream Closed");
                break;
            }
            n = IO_Write(fd, buf+off, len);
            if (n == JVM_IO_ERR) {
                JNU_ThrowIOExceptionWithLastError(env, "Write error");
                break;
            } else if (n == JVM_IO_INTR) {
                JNU_ThrowByName(env, "java/io/InterruptedIOException", NULL);
                break;
            }
            off += n;
            len -= n;
        }
    }
    if (buf != stackBuf) {
        free(buf);
    }
}

The writing here is done through IO_Write. At this point, what happens next becomes platform dependent, as IO_Write is a macro defined differently for Windows and Linux.
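The while (len > 0) loop in writeBytes() exists because a single low-level write may accept fewer bytes than it was given, so the code keeps advancing off and shrinking len until everything is out. The same pattern shows up in Java whenever you write to a channel. Here is a sketch of it, with a made-up StingyChannel that accepts at most a few bytes per call to simulate short writes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

class PartialWriteDemo {
    // Hypothetical channel that takes at most `limit` bytes per write() call.
    static class StingyChannel implements WritableByteChannel {
        final ByteArrayOutputStream received = new ByteArrayOutputStream();
        private final int limit;

        StingyChannel(int limit) {
            if (limit <= 0) {
                throw new IllegalArgumentException("limit must be positive");
            }
            this.limit = limit;
        }

        @Override
        public int write(ByteBuffer src) {
            int n = Math.min(limit, src.remaining());
            for (int i = 0; i < n; i++) {
                received.write(src.get());
            }
            return n; // may be less than what the caller offered
        }

        @Override
        public boolean isOpen() {
            return true;
        }

        @Override
        public void close() {
        }
    }

    // Keeps calling write() until the whole array is written; returns the number of calls.
    static int writeFully(WritableByteChannel channel, byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int calls = 0;
        try {
            while (buf.hasRemaining()) { // same role as while (len > 0)
                channel.write(buf);      // the buffer position advances like off += n
                calls++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        System.out.println("calls needed: " + writeFully(new StingyChannel(3), data));
    }
}
```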

The Linux Way

The Linux way of handling IO goes through the HPI (Host Porting Interface). Thus, IO_Write is defined as JVM_Write in io_util_md.h:

#define IO_Write JVM_Write

The code for JVM_Write is defined in the JVM itself. It is neither Java nor C, but C++. The function can be found in jvm.cpp:

JVM_LEAF(jint, JVM_Write(jint fd, char *buf, jint nbytes))
  JVMWrapper2("JVM_Write (0x%x)", fd);

  //%note jvm_r6
  return (jint)hpi::write(fd, buf, nbytes);
JVM_END

The writing is now done by various HPI methods. Although you could go further, I’m going to stop here, since we’re now so far from where we started.

The Way of Windows

In Windows, the method IO_Write is routed away from the HPI layer. Instead, it’s redefined as handleWrite in io_util_md.h.

The implementation for handleWrite() is defined in io_util_md.c:

JNIEXPORT
size_t
handleWrite(jlong fd, const void *buf, jint len)
{
    BOOL result = 0;
    DWORD written = 0;
    HANDLE h = (HANDLE)fd;
    if (h != INVALID_HANDLE_VALUE) {
        result = WriteFile(h,           /* File handle to write */
                      buf,              /* pointers to the buffers */
                      len,              /* number of bytes to write */
                      &written,         /* receives number of bytes written */
                      NULL);            /* no overlapped struct */
    }
    if ((h == INVALID_HANDLE_VALUE) || (result == 0)) {
        return -1;
    }
    return written;
}

The WriteFile function is part of the Windows API. The Windows API is not open source, so we have to stop here.

Conclusion

We’ve taken a tour through the entire call stack of System.out.println(), from the instantiation of System.out to the path through java.io, all the way down to JNI and HPI level IO handling.

We haven’t even reached the end. No doubt there are dozens more levels underneath the HPI layer and the WriteFile API call.

Perhaps the better answer to the question, “how does System.out.println() work” would be “it’s magic”.

In other news

I’ve received the Gold Standard (~top 4%) in the Galois contest earlier this year. Additionally I’ve placed in Group II (~6th place overall) in the Euclid contest with a book prize.

10 Responses to How System.out.println() really works

  1. Jon Page says:

    I’m sure it’s all very good, but I was warned off when you said it was very long. (Thanks for that :-) )

    All I want to know is where does the system.out.println come out? Without this it’s useless, but I can’t find any mention anywhere other than “the log”. Which log? What’s it called, where is it?

    Similarly, how do I divert it to a log that I CAN find?

    It’s the great mystery of system.out.println…

  2. Anonymous says:

    I always wondered how this thing works. At a high level, I always knew that there should be some buffering and some basic synchronization, but it’s amazing to see the hand off between Java, JNI, C++, this is a great article.

  5. Edward Gelernt says:

    If there is a BufferedWriter, why is there a BufferedOutputStream at a lower level? What is the need for two buffered components?

  6. Edward Gelernt says:

    Also, one small error that I caught:
    In StreamEncoder.writeBytes(), the method out.write() is called. You said that the out object that is being used is a BufferedOutputStream. However, the out object is actually the PrintStream itself. The function PrintStream.write(byte[],int,int) is called, which then in turn calls the BufferedOutputStream.write(byte[],int,int) method.
    Again, just a minor mistake. The article is great though!
