How to download a file via HTTP in Java

This article describes how to download a file from an HTTP server. Java 6 or higher is required.
First I will show you two basic examples, which might work fine for smaller files but will probably fail for larger ones (10+ MB). The last example describes a good approach.

Basic example (not good)

Java provides the java.nio.channels.FileChannel class for reading and writing files. Its transferFrom method reads from an input channel and writes to the output file. Everything would be great, but the problem is on the remote side: in the HTTP server, or actually in its operating system.

When a server or operating system receives a request to download a file, it first stores the file in a buffer or cache. There is no guarantee that the complete file will be cached; that depends on the file size and on how the operating system handles files.

FileChannel retrieves data from that cache, and if the cached file is not complete, Java cannot download the complete file either. From the client's side it looks as if the server stopped sending data, because no more data is received.

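import java.io.FileOutputStream;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

// url=http://www.example.com/testFile.zip
// localFile=/path/to/testFile.zip
public void download(String url, String localFile) throws Exception {

    System.out.println("Downloading " + localFile);

    ReadableByteChannel in = Channels.newChannel(new URL(url).openStream());
    FileOutputStream fos = new FileOutputStream(localFile);
    FileChannel channel = fos.getChannel();
    // A single call: it copies whatever the channel delivers, which may be less than the whole file
    channel.transferFrom(in, 0, Long.MAX_VALUE);
    channel.close();
    fos.close();
    in.close();

    System.out.println("Download complete");

}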

In this example the FileChannel tries to read all data from the ReadableByteChannel and writes it to the file starting at position 0. It reads until the last byte available in the cache, but never more than the maximum number of bytes (Long.MAX_VALUE). The method returns the number of bytes actually transferred, which can be less than the size of the file.

So the problem is: if the cache does not contain the complete file, Java cannot download the complete file either. For larger files this can happen quite often.
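
One way to notice an incomplete download is to compare the value returned by transferFrom with the size you expected. The sketch below is only an illustration: it uses an in-memory stream as a stand-in for the HTTP response, and the class name TransferCheck, the method copyOnce, and the sizes are made up for the example. It assumes Java 7 or later because of try-with-resources.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

public class TransferCheck {

    // Copies whatever the source delivers in a single transferFrom call
    // and returns the number of bytes that call reported.
    public static long copyOnce(ReadableByteChannel source, File target) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(target);
             FileChannel channel = fos.getChannel()) {
            return channel.transferFrom(source, 0, Long.MAX_VALUE);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1000]; // hypothetical stand-in for the response body
        long expectedSize = 4096;        // hypothetical stand-in for the Content-Length header
        File target = File.createTempFile("transfer", ".bin");
        target.deleteOnExit();

        long received = copyOnce(Channels.newChannel(new ByteArrayInputStream(payload)), target);
        if (received < expectedSize) {
            System.out.println("Incomplete: " + received + " of " + expectedSize + " bytes");
        }
    }
}
```

The return value is the only hint the first example gets about a short transfer, and that example throws it away.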

Basic example 2 (better, but still not good enough)

The next example is slightly better: it compares the number of bytes transferred with the number of bytes expected and loops until all data is received.

Now we call the transferFrom method in a loop, and in each cycle we transfer a smaller chunk of data (64 KB) and print how many bytes have been transferred so far. At the end we print the size of the transferred file, which should be the same as the expected size. But sooner or later the server will empty the cache, and Java will loop endlessly, reading zero bytes from the input stream.

import java.io.File;
import java.io.FileOutputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// url=http://www.example.com/testFile.zip
// localFile=/path/to/testFile.zip
public static void download(String url, String localFile) throws Exception {

    System.out.println("Downloading " + localFile);
    URL website = new URL(url);
    URLConnection connection = website.openConnection();
    ReadableByteChannel rbc = Channels.newChannel(connection.getInputStream());
    FileOutputStream fos = new FileOutputStream(localFile);
    // getContentLength() returns -1 when the server does not report a size
    long expectedSize = connection.getContentLength();
    System.out.println("Expected size: " + expectedSize);
    long transferedSize = 0L;
    // Flaw: if transferFrom keeps returning 0, this loop never ends
    while (transferedSize < expectedSize) {
        long delta = fos.getChannel().transferFrom(rbc, transferedSize, 1 << 16);
        transferedSize += delta;
        System.out.println(transferedSize + " bytes received");
    }
    fos.close();
    rbc.close();
    System.out.println("Download complete");
    System.out.println("Size: " + new File(localFile).length());

}
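
The endless loop happens because the code never checks whether a call to transferFrom made any progress. A minimal sketch of such a guard follows. It uses an in-memory stream in place of the server, and the names GuardedCopy, copy and CHUNK are hypothetical. It assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

public class GuardedCopy {

    static final int CHUNK = 1 << 16; // 65,536 bytes per transferFrom call

    // Copies up to expectedSize bytes in chunks and returns the number of
    // bytes actually written. Stops as soon as a call transfers nothing.
    public static long copy(ReadableByteChannel source, File target, long expectedSize)
            throws IOException {
        long transferred = 0;
        try (FileOutputStream fos = new FileOutputStream(target);
             FileChannel channel = fos.getChannel()) {
            while (transferred < expectedSize) {
                long delta = channel.transferFrom(source, transferred, CHUNK);
                if (delta == 0) {
                    break; // no more data arrived; looping further would never end
                }
                transferred += delta;
            }
        }
        return transferred;
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("guarded", ".bin");
        target.deleteOnExit();
        // The source holds 100,000 bytes although 150,000 are "expected".
        ReadableByteChannel source =
                Channels.newChannel(new ByteArrayInputStream(new byte[100000]));
        long written = copy(source, target, 150000);
        System.out.println(written + " of 150000 bytes written");
    }
}
```

The guard only stops the loop. It does not fetch the missing bytes, which is what the next section is about.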

Good case

Now we know the number of bytes transferred, the expected size, and the size of the file on disk. What is missing is a way to tell the server which part of the file we still need.

First we check the size of the file on disk with the getFileSize() method and send that value to the server in the 'Range' request header. A server that supports range requests then sends only the bytes from that offset onward, so getContentLength() returns the remaining size rather than the total size of the file. From the response we can tell whether anything is left to download.
If something is left, we call transferFrom to append the remaining bytes to the local file, starting at transferedSize (I actually transfer in smaller 64 KB chunks).
So all we need to do is call the same method in a loop until all the bytes from the server are transferred to our local disk.
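
A server that honors a Range request answers with a Content-Range header of the form "bytes first-last/total", where the total may be "*" if it is unknown. Reading the total from that header is one way to learn the full file size while resuming. The helper below is only a sketch. The class ContentRange and its methods are hypothetical names, not part of any library.

```java
public class ContentRange {

    // Builds the Range header value for resuming at the given byte offset.
    public static String resumeFrom(long offset) {
        return "bytes=" + offset + "-";
    }

    // Returns the total size from a Content-Range value such as
    // "bytes 1000-4999/5000", or -1 if the value is missing, malformed,
    // or reports an unknown total ("*").
    public static long totalSize(String contentRange) {
        if (contentRange == null) {
            return -1;
        }
        int slash = contentRange.lastIndexOf('/');
        if (slash < 0 || slash == contentRange.length() - 1) {
            return -1;
        }
        String total = contentRange.substring(slash + 1).trim();
        if (total.equals("*")) {
            return -1;
        }
        try {
            long value = Long.parseLong(total);
            return value < 0 ? -1 : value;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(resumeFrom(1000));                  // bytes=1000-
        System.out.println(totalSize("bytes 1000-4999/5000")); // 5000
    }
}
```

With a URLConnection, the header value could be read with connection.getHeaderField("Content-Range"), which returns null when the server did not send it.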

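import java.io.File;
import java.io.FileOutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// url=http://www.example.com/testFile.zip
// localFile=/path/to/testFile.zip
public void download(String url, String localFile) throws Exception {

    System.out.println("Downloading " + localFile);

    boolean downloadComplete = false;

    // Each pass resumes where the previous one stopped
    while (!downloadComplete) {
        downloadComplete = transferData(url, localFile);
    }

}

// Returns true when the local file is complete, false when another pass is needed
public boolean transferData(String url, String filename) throws Exception {

    long transferedSize = getFileSize(filename);

    HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
    connection.setRequestProperty("Range", "bytes=" + transferedSize + "-");
    int status = connection.getResponseCode();

    // 416 (Range Not Satisfiable): the offset is at or past the end of the file, nothing is left
    if (status == 416) {
        System.out.println("File is complete");
        connection.disconnect();
        return true;
    }

    // 206 (Partial Content) confirms that the server honored the Range header;
    // otherwise it sends the whole file and we have to start over from byte 0
    boolean resume = (status == HttpURLConnection.HTTP_PARTIAL);
    if (!resume) {
        transferedSize = 0;
    }

    long remainingSize = connection.getContentLength(); // -1 if the server does not report it
    System.out.println("Remaining size: " + remainingSize);

    ReadableByteChannel rbc = Channels.newChannel(connection.getInputStream());
    FileOutputStream fos = new FileOutputStream(filename, resume); // true = append to existing file
    long received = 0;
    try {
        System.out.println("Continue downloading at " + transferedSize);
        while (remainingSize < 0 || received < remainingSize) {
            long delta = fos.getChannel().transferFrom(rbc, transferedSize + received, 1 << 16);
            if (delta == 0) {
                break; // no more data on this connection
            }
            received += delta;
            System.out.println((transferedSize + received) + " bytes received");
        }
    } finally {
        fos.close();
        rbc.close();
    }

    if (received == remainingSize) {
        System.out.println("File is complete");
        return true;
    }
    System.out.println("Download incomplete, retrying");
    return false;

}

public long getFileSize(String filename) {
    File f = new File(filename);
    System.out.println("Size: " + f.length());
    return f.length();
}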

This approach is simple and reliable and has several advantages:
- You can easily calculate the percentage of transferred data and use it in a progress bar or similar.
- Even if the connection to the server is dropped during the transfer, the transfer can be resumed the next time you start the program (note the second argument when creating the FileOutputStream: 'true' means append data to the existing file).
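
The percentage mentioned in the first point could be computed as in the sketch below. It assumes the total size is the number of bytes already on disk plus the remaining size reported by the server. The class name Progress and the method percent are hypothetical.

```java
public class Progress {

    // Returns the completed percentage (0 to 100), or -1 when the total size is unknown.
    public static int percent(long transferedSize, long totalSize) {
        if (totalSize <= 0) {
            return -1;
        }
        // Clamp so that a stale or oversized local file never reports more than 100
        long capped = Math.min(Math.max(transferedSize, 0), totalSize);
        return (int) (capped * 100 / totalSize);
    }

    public static void main(String[] args) {
        // 65,536 of 262,144 bytes received
        System.out.println(percent(65536, 262144) + "%"); // prints "25%"
    }
}
```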