A Brief Introduction To Java IO

Input and output, or I/O for short, are probably the most underappreciated topics in computer science. Most programmers take them for granted. Developers use I/O on a daily basis without realizing how interesting it can be. If you want to be a developer, you will almost always write programs that require input or produce output. So as a Java programmer, you should find I/O interesting. In this tutorial, we will try to improve your understanding of Java I/O. Don’t worry if you don’t understand everything the first time. We will explain each concept again when the need arises.

Java’s core API includes a particularly rich set of I/O classes, mostly in the java.io and java.nio packages. These packages support a variety of I/O styles. One distinction is between byte-oriented I/O (handled by input and output streams) and character-oriented I/O (handled by readers and writers). Another distinction is made between old-fashioned stream-based I/O and new-style channel- and buffer-based I/O. All of these are appropriate for various needs and use cases. None of them should be overlooked.


Concept of Java IO

Java’s I/O libraries are designed in such a way that you can read from external data sources and write to external targets regardless of what you’re writing to or reading from. Abstraction shows its true power while working with Java’s I/O classes. When reading from a file, you use the same methods as when reading from the console or a network connection. To write to a file, use the same methods as you would to write to a byte array or a serial port device.

Reading and writing without regard for where your data is coming from or going is a powerful abstraction. This allows you to define I/O streams that automatically compress, encrypt, and filter data from one format to another. With these tools, programs can send encrypted data or zip files with little to no knowledge of what they’re doing. Cryptography and compression can be isolated in a few lines of code that say, “Oh yes, make this a compressed, encrypted output stream.”
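As a minimal sketch of this idea, the following example wraps an in-memory stream in the JDK's GZIPOutputStream and GZIPInputStream. The class and method names (CompressedStreamSketch, compress, decompress) are made up for illustration. The code that writes and reads the data uses the ordinary write() and read() calls and never deals with the compression format itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedStreamSketch {

    // Writes the data through a GZIPOutputStream and returns the compressed bytes.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(sink)) {
            out.write(data); // the same write() call any OutputStream offers
        } // closing the stream finishes the GZIP data
        return sink.toByteArray();
    }

    // Reads the compressed bytes back through a GZIPInputStream.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello, hello, hello, hello".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Swapping the ByteArrayOutputStream for a FileOutputStream would send the same compressed data to a file without changing the writing code.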

In simple words, I/O is the means by which software communicates with the outside world. Java provides a powerful and adaptable set of tools for performing this critical task. Having said that, let us begin with the fundamentals.

What Is A Stream?

An I/O Stream represents either an input “source” or an output “destination”. A stream can represent a wide range of different sources and destinations, such as disk files, devices, other programs, and memory arrays.

Streams can handle a wide range of data types, including simple bytes, primitive data types, localized characters, and objects. Some streams simply transmit data, while others manipulate and transform it in useful ways.

Whatever their internal workings, all streams present the same simple model to programs that use them: A stream is a sequence of data. 

Input streams transfer bytes of data from an external source into a Java program. Output streams transport bytes of data from a program to an external target. In some cases, streams can also be used to transfer bytes from one part of a Java program to another.

An input stream is used by a program to read data from a source one “item” at a time:

[Figure: a program reading data from a source through an input stream]

A program uses an output stream to write data to a destination, one “item” at a time:

[Figure: a program writing data to a destination through an output stream]

The term stream is derived from a comparison of a sequence to a stream of water.

An input stream is analogous to a siphon sucking up water. If the siphon is drawing water from a river, it may well operate indefinitely.


An output stream is like a hose that sprays out water.


Siphons can be linked to hoses to transport water from one location to another. If a siphon is drawing from a limited source, such as a bucket, it will eventually run out of water. Likewise, an input stream may read from a finite source of bytes, such as a file, or from a potentially unbounded source of bytes, such as System.in. Similarly, an output stream can output a fixed number of bytes or an indefinite number of bytes.

A Java program can receive input from a variety of sources. Output can be routed to a variety of destinations. 

The power of the stream metaphor is that it abstracts away the differences between these sources and destinations. All input and output operations are simply treated as streams, employing the same classes and methods. You do not need to learn a new API for each type of device. The same API that can read files can also read network sockets, serial ports, Bluetooth transmissions, and other types of data.
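To make this concrete, here is a small sketch in which one method, written only against InputStream, is used with two different sources: a byte array in memory and a temporary file. The names SameApiSketch and countBytes are illustrative and not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SameApiSketch {

    // Counts the bytes in any InputStream; it never asks where they come from.
    static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

        // Source 1: an in-memory byte array.
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(countBytes(in)); // 5
        }

        // Source 2: a temporary file holding the same bytes.
        Path file = Files.createTempFile("same-api", ".txt");
        try {
            Files.write(file, data);
            try (InputStream in = new FileInputStream(file.toFile())) {
                System.out.println(countBytes(in)); // 5
            }
        } finally {
            Files.delete(file);
        }
    }
}
```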

Where Do Streams Come From?

Most programmers’ first point of contact with input is System.in. This usually refers to a console window, most likely the one in which the Java program was launched. If the input is redirected so that the program reads from a file, System.in reads from that file instead.

The static field out in the java.lang.System class similarly allows for output to the console. There is also an err field in this class. It is most often used for debugging and for reporting error messages from catch clauses.

    try {
        //... do something that might throw an exception
    } catch (Exception ex) {
        System.err.println(ex);
    }

Both System.out and System.err are print streams, that is, instances of java.io.PrintStream.

Files are another common source of input and destination for output. File input streams deliver a data stream that begins with the first byte of a file and ends with the last byte of that file. File output streams write data into a file, either by erasing the file’s contents and beginning again, or by appending data to the file.

Streams are also provided by network connections. When you connect to a web server, FTP server, or other type of server, you read data from an input stream connected to that server and write data to an output stream connected to that server.

Streams can also be generated by Java programs. Data is moved from one part of a Java program to another using byte array input streams, byte array output streams, piped input streams, and piped output streams.
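The byte array case can be sketched as follows: one method writes values into a ByteArrayOutputStream, and another reads the resulting bytes back through a ByteArrayInputStream. The class and method names are hypothetical, and piped streams (which normally involve two threads) are left out to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class InMemoryStreamSketch {

    // One part of the program writes values into a byte array output stream...
    static byte[] produce(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int value : values) {
            out.write(value);
        }
        return out.toByteArray();
    }

    // ...and another part reads them back through a byte array input stream.
    static int sum(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int total = 0;
        int b;
        while ((b = in.read()) != -1) {
            total += b;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(produce(1, 2, 3))); // 6
    }
}
```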

The Stream Classes

Most of the classes that work directly with streams are part of the java.io package. The two main classes are java.io.InputStream and java.io.OutputStream. These are abstract base classes for many different subclasses with more specialized abilities.

    The subclasses include:

        BufferedInputStream
        BufferedOutputStream
        ByteArrayInputStream
        ByteArrayOutputStream
        DataInputStream
        DataOutputStream
        FileInputStream
        FileOutputStream
        FilterInputStream
        FilterOutputStream
        ObjectInputStream
        ObjectOutputStream
        PipedInputStream
        PipedOutputStream
        PrintStream
        PushbackInputStream
        SequenceInputStream

What Is A Byte Stream?

Byte streams are a convenient way to handle byte input and output. Byte streams are used to read and write binary data, for example.

Two class hierarchies are used to define byte streams. There are two abstract classes at the top: InputStream and OutputStream. Each of these abstract classes has several concrete subclasses that deal with the differences between different devices such as disk files, network connections, and even memory buffers.


To input and output 8-bit bytes, programs use byte streams. All byte stream classes are descended from InputStream and OutputStream.

There are numerous classes of byte streams. We’ll look at the file I/O byte streams, FileInputStream and FileOutputStream, to see how they work. Other types of byte streams are used in much the same way; the main difference is how they are constructed.

How To Use Byte Streams?

We’ll learn about FileInputStream and FileOutputStream by looking at ByteStreamDemo, a program that uses byte streams to copy text.txt to out.txt, one byte at a time.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
 
public class ByteStreamDemo {
    public static void main(String[] args) throws IOException {
 
        FileInputStream in = null;
        FileOutputStream out = null;
 
        try {
            in = new FileInputStream("text.txt");
            out = new FileOutputStream("out.txt");
            int c;
 
            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } finally {
            if (in != null) {
                in.close();
            }
            if (out != null) {
                out.close();
            }
        }
    }
}

As shown in the figure below, ByteStreamDemo spends the majority of its time in a simple loop that reads the input stream and writes the output stream one byte at a time.

[Figure: ByteStreamDemo reading from the input stream and writing to the output stream, one byte at a time]

Always Close Streams

Closing a stream when it is no longer required is critical; in fact, ByteStreamDemo employs a finally block to ensure that both streams are closed even if an error occurs. This practice aids in the prevention of major resource leaks.

One possible error is that ByteStreamDemo was unable to open one or both files. When this happens, the stream variable for that file never changes from its initial null value. That is why, before invoking close, ByteStreamDemo checks that each stream variable contains an object reference.
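Since Java 7, the try-with-resources statement offers a shorter way to get the same guarantee: every stream declared in the parentheses is closed automatically, and only the ones that were opened successfully. The following is a sketch of the same byte-by-byte copy written that way. The class name, the copy method, and the temporary file names are illustrative.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TryWithResourcesSketch {

    // Copies a file byte by byte; no finally block or null check is needed.
    static void copy(String from, String to) throws IOException {
        try (FileInputStream in = new FileInputStream(from);
             FileOutputStream out = new FileOutputStream(to)) {
            int c;
            while ((c = in.read()) != -1) {
                out.write(c);
            }
        } // both streams are closed here, even if an exception was thrown
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-source", ".txt");
        Path target = Files.createTempFile("copy-target", ".txt");
        try {
            Files.write(source, "some text".getBytes(StandardCharsets.UTF_8));
            copy(source.toString(), target.toString());
            System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
        } finally {
            Files.delete(source);
            Files.delete(target);
        }
    }
}
```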

When Not To Use Byte Streams

ByteStreamDemo appears to be a normal program, but it represents a type of low-level I/O that should be avoided. Because text.txt contains character data, using character streams, as discussed in the following section, is the best approach. Streams are also available for more complex data types. Byte streams should only be used for the most basic I/O operations.

So, why bring up byte streams? Because all other types of streams are based on byte streams.

What Is A Character Stream?

Numbers are only one type of data that a typical Java program needs to read and write. Many programs also deal with text, which is made up of characters. Because computers can only understand numbers, characters are encoded by assigning a number to each character in a given script. For instance, in the common ASCII encoding, the character A corresponds to the number 65, the character B to 66, the character C to 67, and so on. Different encodings can cover different scripts, or encode the same or similar scripts in different ways.

Text in Java is primarily made up of the char primitive data type, char arrays, and Strings (which were traditionally stored internally as arrays of chars). Just as understanding bytes is required to fully comprehend how input and output streams operate, understanding chars is required to fully comprehend how readers and writers operate.

A char in Java is a 2-byte unsigned integer, the only unsigned primitive type in the language. As a result, the possible char values range from 0 to 65,535. Each char holds one UTF-16 code unit: characters in Unicode’s Basic Multilingual Plane fit in a single char, while supplementary characters (such as many emoji) need a pair of chars. In this range, chars can be assigned using int literals; for example:

char copyright = 169;

chars may also be assigned using char literals, that is, the character itself enclosed in single quotes:

char copyright = '©';
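The short sketch below illustrates the 16-bit range of char. The helper charsNeeded is a made-up name; it relies on the standard Character.toChars method to show that a code point above 65,535 is represented by two chars rather than one.

```java
public class CharSketch {

    // Number of chars (UTF-16 code units) needed for one Unicode code point.
    static int charsNeeded(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        char copyright = 169;
        System.out.println(copyright == '\u00A9');      // true: same value, two notations
        System.out.println((int) Character.MAX_VALUE);  // 65535
        System.out.println((int) (char) 65536);         // 0: the value wraps around
        System.out.println(charsNeeded(0x00A9));        // 1
        System.out.println(charsNeeded(0x1F600));       // 2: a surrogate pair
    }
}
```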

Java performs I/O via streams, which are of two types:

        Byte Stream

        Character Stream

Byte Streams are a simple way to handle input and output of bytes.

Character Streams are intended to read and write data from and to a stream of characters. Because of the various file encoding systems, we require this specialized stream.

What Is The Point Of Character Streams?

Streams are designed primarily for data that can be read as pure bytes: essentially, byte and numeric data encoded as binary numbers of some kind. Streams are not intended for reading or writing text, including ASCII text like “Hello World” and numbers formatted as text like “3.1415929”.

Byte streams do not apply any encoding scheme. So if the file isn’t saved in a single-byte character set (such as ASCII or Latin-1), we won’t get proper output, because the byte stream treats each byte as a separate unit. For example, if a file uses the UTF-16 encoding, in which most characters take 2 bytes, the byte stream will split each character into two parts. To process it properly, we would have to write the decoding logic ourselves.

Because Java represents characters internally as Unicode, we use Character Streams, which automatically translate to and from the local character set.
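The difference can be seen in a small sketch that reads the same UTF-8 data twice: once through a plain InputStream and once through an InputStreamReader. The class and method names are illustrative, and the example assumes UTF-8 rather than the platform's local character set so that the result is predictable.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ByteVsCharSketch {

    // Counts the items delivered when the data is read as raw bytes.
    static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    // Counts the items delivered when the data is decoded as UTF-8 characters.
    static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "cafe" with an accented final letter, which takes two bytes in UTF-8.
        byte[] data = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data)); // 5
        System.out.println(countChars(data)); // 4
    }
}
```

The byte stream hands the accented letter over as two unrelated bytes, while the reader reassembles it into one character.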

The following are the main classes (abstract) associated with character streams:

Reader

Writer

Readers And Writers

Fundamentally, input and output streams are based on bytes. Characters, whose widths might vary depending on the character set, are the foundation for both Readers and Writers. For instance, 1-byte characters are used in Latin-1 and ASCII. Characters in UTF-32 are 4 bytes each. UTF-8 employs characters with different widths (between one and four bytes). Readers get their information from streams since characters are ultimately made up of bytes. But before sending them on, they translate those bytes into characters using a predetermined encoding system. Similarly, before writing to an underlying stream, writers convert chars to bytes using a specified encoding.
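On the writing side, the sketch below sends the same one-character string through an OutputStreamWriter using three different encodings and reports how many bytes reach the underlying stream. The names EncodingWidthSketch and encodedLength are hypothetical. Only charsets from StandardCharsets are used, so UTF-32 is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingWidthSketch {

    // Writes the text through a Writer and reports how many bytes reached the stream.
    static int encodedLength(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        String eAcute = "\u00E9"; // a single accented letter
        System.out.println(encodedLength(eAcute, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(eAcute, StandardCharsets.UTF_16BE));   // 2
    }
}
```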

The java.io.Reader and java.io.Writer classes are abstract superclasses for character-based data readers and writers. The subclasses are notable for their handling of character set conversion. The core Java API includes nine reader and eight writer classes, all in the java.io package:

BufferedReader
BufferedWriter
CharArrayReader
CharArrayWriter
FileReader
FileWriter
FilterReader
FilterWriter
InputStreamReader
LineNumberReader
OutputStreamWriter
PipedReader
PipedWriter
PrintWriter
PushbackReader
StringReader
StringWriter

These classes, for the most part, have methods that are very similar to the equivalent stream classes. Often, the only difference is that a byte in a stream method’s signature becomes a char in the matching reader or writer method’s signature.

For example, the java.io.OutputStream class declares these three write( ) methods:

    public abstract void write(int i) throws IOException
    public void write(byte[] data) throws IOException
    public void write(byte[] data, int offset, int length) throws IOException

The java.io.Writer class similarly declares these three write( ) methods:

    public void write(int i) throws IOException
    public void write(char[] data) throws IOException
    public abstract void write(char[] data, int offset, int length) throws IOException

As you can see, the signatures are nearly identical, except that the byte array in the latter two methods has been replaced with a char array, and a different method is abstract in each class. There is also a less obvious difference that is not reflected in the signature. While the int passed to the OutputStream write() method is reduced modulo 256 before being output, the int passed to the Writer write() method is reduced modulo 65,536 before being output. This reflects the different ranges of byte and char.
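The truncation of the int argument can be observed with two in-memory classes from java.io, ByteArrayOutputStream and StringWriter. This is a sketch only; the class and method names are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringWriter;

public class WriteIntSketch {

    // Returns the unsigned value of the byte that write(int) actually stores.
    static int byteWritten(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);
        return out.toByteArray()[0] & 0xFF;
    }

    // Returns the char that write(int) actually stores.
    static char charWritten(int value) {
        StringWriter writer = new StringWriter();
        writer.write(value);
        return writer.toString().charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(byteWritten(256 + 65));   // 65: only the low 8 bits survive
        System.out.println(charWritten(65536 + 65)); // A: only the low 16 bits survive
    }
}
```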

How To Use Character Streams?

As mentioned earlier, Reader and Writer are the superclasses of all character stream classes. They are abstract classes, so we cannot instantiate them directly. FileReader and FileWriter are concrete subclasses that specialize in file I/O. The following CharacterStreamDemo program illustrates how to use these classes.

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class CharacterStreamDemo {
    public static void main(String[] args) throws IOException {

        FileReader inputCharacterStream = null;
        FileWriter outputCharacterStream = null;

        try {
            inputCharacterStream = new FileReader("text.txt");
            outputCharacterStream = new FileWriter("outputcharacter.txt");

            int character;
            while ((character = inputCharacterStream.read()) != -1) {
                outputCharacterStream.write(character);
            }
        } finally {
            if (inputCharacterStream != null) {
                inputCharacterStream.close();
            }
            if (outputCharacterStream != null) {
                outputCharacterStream.close();
            }
        }
    }
}

Output

Look at the current file directory. You will see a new text file named "outputcharacter.txt".

Both ByteStreamDemo and CharacterStreamDemo are very similar to each other. But there is an important difference between them. CharacterStreamDemo uses FileReader and FileWriter, whereas ByteStreamDemo uses FileInputStream and FileOutputStream.

You may be confused when you notice that both programs use an int variable to read into and write from. The difference is that in CharacterStreamDemo the int variable holds a character value in its low-order 16 bits, whereas in ByteStreamDemo it holds a byte value in its low-order 8 bits.

Character streams are frequently used as “wrappers” for byte streams. The byte stream performs physical I/O for the character stream, while the character stream handles the character-to-byte translation. FileReader, for example, makes use of FileInputStream, whereas FileWriter makes use of FileOutputStream.

InputStreamReader and OutputStreamWriter are two general-purpose byte-to-character “bridge” streams. When there are no prepackaged character stream classes that meet your needs, use them to create your own. A common case is creating character streams from the byte streams provided by socket classes.

How To Implement Line-Oriented IO?

Character I/O usually occurs in bigger units than single characters. A line is a common unit: a string of characters with a line terminator at the end. A line terminator can be a carriage-return/line-feed sequence (“\r\n”), a single carriage-return (“\r”), or a single line-feed (“\n”). Supporting all of these line terminators lets programs read text files created on any of the widely used operating systems.
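As a quick illustration, the sketch below feeds a string containing all three kinds of line terminators to BufferedReader.readLine(), reading from a StringReader instead of a file. The class and method names are illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LineTerminatorSketch {

    // Splits the text into lines with readLine(), which strips the terminators.
    static List<String> lines(String text) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Mixes "\r\n", "\r" and "\n" in one input.
        System.out.println(lines("one\r\ntwo\rthree\nfour")); // [one, two, three, four]
    }
}
```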

Let us change the CharacterStreamDemo example to use line-oriented I/O. To accomplish this, we must employ two new classes: BufferedReader and PrintWriter. We will discuss them in great detail in a later tutorial. For the time being, we’re only interested in their support for line-oriented I/O.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class CharacterStreamDemo {
    public static void main(String[] args) throws IOException {

        BufferedReader inputCharacterStream = null;
        PrintWriter outputCharacterStream = null;

        try {
            inputCharacterStream = new BufferedReader(new FileReader("text.txt"));
            outputCharacterStream = new PrintWriter(new FileWriter("outputcharacter2.txt"));

            String line;
            while ((line = inputCharacterStream.readLine()) != null) {
                outputCharacterStream.println(line);
            }
        } finally {
            if (inputCharacterStream != null) {
                inputCharacterStream.close();
            }
            if (outputCharacterStream != null) {
                outputCharacterStream.close();
            }
        }
    }
}

Output

Look at the current file directory. You will see a new text file named “outputcharacter2.txt”.

What Is A Buffered Stream?

A buffer is a memory location that aids in reading from an input device and writing to an output device.

Assume you want to save bytes to a file. You can write each byte to the file immediately, requesting the kernel repeatedly. If you need to write 1000 bytes, you’ll need to make 1000 kernel requests. This adds to the overhead.

Assume we have a byte array of size 10, for example. We first write our bytes to this array, and when it is full, we write the entire array to the file at once, using only one kernel request. So, how many requests did we send in order to write 1000 bytes? 100, a tenfold reduction. This array is your buffer.

It stores data until it is full, then writes those data elements to the output all at once.
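The arithmetic above can be checked with a small sketch. CountingOutputStream is a hypothetical stand-in for the file: instead of making kernel requests, it simply counts how many write calls reach it. Wrapping it in a BufferedOutputStream with a 10-byte buffer reproduces the 1000 versus 100 comparison.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferCountSketch {

    // Hypothetical sink that counts how many write requests reach it.
    static class CountingOutputStream extends OutputStream {
        int requests = 0;

        @Override
        public void write(int b) {
            requests++; // one request per single byte
        }

        @Override
        public void write(byte[] data, int offset, int length) {
            requests++; // one request per whole chunk
        }
    }

    // Writes every byte straight to the sink.
    static int unbufferedRequests(int totalBytes) {
        CountingOutputStream sink = new CountingOutputStream();
        for (int i = 0; i < totalBytes; i++) {
            sink.write(i);
        }
        return sink.requests;
    }

    // Writes the same bytes through a buffer of the given size.
    static int bufferedRequests(int totalBytes, int bufferSize) throws IOException {
        CountingOutputStream sink = new CountingOutputStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < totalBytes; i++) {
                out.write(i);
            }
        } // closing flushes whatever is still in the buffer
        return sink.requests;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(unbufferedRequests(1000));   // 1000
        System.out.println(bufferedRequests(1000, 10)); // 100
    }
}
```

In practice the default buffer is far larger than 10 bytes, so the reduction is correspondingly greater.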

A similar process occurs during read operations. Java’s buffered stream classes handle this process automatically.

The majority of the examples we’ve seen so far make use of unbuffered I/O. This means that the underlying operating system handles each read or write request directly. This can make a program much less efficient because each such request frequently triggers disk access, network activity, or some other relatively expensive operation.

Buffering I/O is a common performance enhancement. To achieve this performance improvement, use Java’s BufferedInputStream class to “wrap” any InputStream into a buffered stream.

The Java platform uses buffered I/O streams to reduce this kind of overhead. Buffered input streams read data from a memory area known as a buffer; the native input API is called only when the buffer is empty. Similarly, buffered output streams write data to a buffer and only call the native output API when the buffer is full. A program can convert an unbuffered stream to a buffered stream by using the wrapping idiom, which involves passing the unbuffered stream object to the constructor of a buffered stream class. To use buffered I/O, modify the constructor invocations in the CharacterStreamDemo example as follows:

    inputCharacterStream = new BufferedReader(new FileReader("text.txt"));
    outputCharacterStream = new PrintWriter(new BufferedWriter(new FileWriter("outputcharacter2.txt")));

There are four buffered stream classes used to wrap unbuffered streams: BufferedInputStream and BufferedOutputStream create buffered byte streams, while BufferedReader and BufferedWriter create buffered character streams.

Flushing Buffered Streams

It is frequently advantageous to write out a buffer at critical points rather than waiting for it to fill. This is referred to as flushing the buffer.

Some buffered output classes support autoflush, which can be enabled using an optional constructor argument. When autoflush is enabled, certain key events flush the buffer. For example, an autoflush PrintWriter object flushes the buffer with each println or format call. 

To manually flush a stream, use its flush method. The flush method works on any output stream but has no effect unless it is buffered.
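The effect of autoflush can be observed with a sketch that prints one line through a buffered PrintWriter and then checks how many bytes have arrived at the underlying stream before any explicit flush. The class and method names are hypothetical, and a ByteArrayOutputStream stands in for a real destination.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class FlushSketch {

    // Returns how many bytes have reached the sink right after println("hi"),
    // before any explicit flush() or close().
    static int bytesDeliveredAfterPrintln(boolean autoFlush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(
                new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8)),
                autoFlush);
        writer.println("hi");
        int delivered = sink.size();
        writer.close();
        return delivered;
    }

    public static void main(String[] args) {
        // Without autoflush the text is still sitting in the buffer.
        System.out.println(bytesDeliveredAfterPrintln(false)); // 0
        // With autoflush, println pushes the buffer out immediately.
        System.out.println(bytesDeliveredAfterPrintln(true) > 0); // true
    }
}
```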

Programming I/O frequently entails translating to and from the neatly formatted data that humans prefer to work with. The Java platform provides two APIs to help you with these tasks. The scanner API divides input into individual tokens that correspond to bits of data. The formatting API formats data so that it is easily readable by humans.

Scanner is another Java class that lets a program read values of different types. It is a simple text scanner that can use regular expressions to parse primitive types and strings. In general, it divides the input into tokens using a delimiter pattern, which matches whitespace by default.

Look at the following example to understand how a Scanner works.

import java.io.*;
import java.util.Scanner;

public class ScannerDemo {
    public static void main(String[] args) throws IOException {

        Scanner scanner = null;

        try {
            scanner = new Scanner(new BufferedReader(new FileReader("text.txt")));

            while (scanner.hasNext()) {
                System.out.println(scanner.next());
            }
        } finally {
            if (scanner != null) {
                scanner.close();
            }
        }
    }
}

Output

Java
is
a
high-level,
class-based,
object-oriented
programming
language
that
is
designed
to
have
as
few
implementation
dependencies
as
possible.

When ScannerDemo is finished with the scanner object, it calls Scanner’s close method. Even though a scanner is not a stream, you must close it to indicate that you have completed your work with its underlying stream.

In order to use a different token separator, call useDelimiter() with a regular expression. Assume you wanted the token separator to be a comma, possibly followed by white space.

scanner.useDelimiter(",\\s*");
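Here is a short sketch of that delimiter in action. It scans a String rather than a file so that it is self-contained; the class name and the tokens method are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDelimiterSketch {

    // Splits the input on a comma that may be followed by white space.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",\\s*");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("red, green,blue,   yellow")); // [red, green, blue, yellow]
    }
}
```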

Translating Individual Tokens

All input tokens are treated as simple String values in the ScannerDemo example. Scanner also supports tokens for all Java primitive types (except char), as well as BigInteger and BigDecimal. Numeric values can also use thousands separators. Thus, in a US locale, Scanner correctly interprets the string “32,767” as an integer value.

We must mention the locale because thousands separators and decimal symbols differ by locale. As a result, if we did not specify that the scanner should use the US locale, the following example would not work correctly in all locales. You usually don’t have to worry about this because your input data comes from sources that use the same locale as you. However, a tutorial example like this one may be run anywhere in the world, so it sets the locale explicitly.

The following example reads a list of double values and adds them up. Here’s the source:

import java.io.FileReader;
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.Locale;
 
public class ScannerSumDemo {
    public static void main(String[] args) throws IOException {
 
        Scanner scanner = null;
        double sum = 0;
 
        try {
            scanner = new Scanner(new BufferedReader(new FileReader("usnumbers.txt")));
            scanner.useLocale(Locale.US);
 
            while (scanner.hasNext()) {
                if (scanner.hasNextDouble()) {
                    sum += scanner.nextDouble();
                } else {
                    scanner.next();
                }  
            }
        } finally {
            if (scanner != null) {
                scanner.close();
            }
        }
 
        System.out.println(sum);
    }
}

And here’s the sample input file, usnumbers.txt:

8.5
32,767
3.14159
1,000,000.1

The result is “1032778.74159”. Note that println prints a double using Double.toString, which always uses a period as the decimal separator, so this output is not adapted to the default locale. To print numbers in a locale-specific form, use formatting, as described in the following topic, Formatting.
