Deflate data in Java with the Deflater class

The Deflater class contains methods to compress blocks of data. You can choose the compression format, the level of compression, and the compression strategy. There are nine steps to deflating data with the Deflater class:

  1. Construct a Deflater object.
  2. Choose the strategy (optional).
  3. Set the compression level (optional).
  4. Preset the dictionary (optional).
  5. Set the input.
  6. Deflate the data repeatedly until needsInput() returns true.
  7. If more input is available, go back to step 5 to provide additional input data. Otherwise, go to step 8.
  8. Finish the data.
  9. If there are more streams to be deflated, reset the deflater.

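1. Construct a Deflater object

There are three Deflater() constructors:

public Deflater(int level, boolean useGzip)

public Deflater(int level)

public Deflater()

The level argument is the compression level described in step 3. If the boolean argument is true, the deflater omits the zlib header and checksum and writes raw deflate data, which is what the GZIP and PKZIP formats expect; the JDK documentation names this parameter nowrap. The no-argument constructor uses the default compression level and includes the zlib header.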
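As a rough illustration, the sketch below compresses the same bytes with each of the three constructors and prints the output sizes. The class name ConstructorDemo, the helper compress(), and the sample text are made up for this example; only the Deflater calls come from java.util.zip.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class ConstructorDemo {

    // Hypothetical helper: runs one complete deflate pass, then releases the deflater.
    static byte[] compress(Deflater def, byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            def.setInput(data);
            def.finish();
            while (!def.finished()) {
                int n = def.deflate(buffer);
                out.write(buffer, 0, n);
            }
        } finally {
            def.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "hello hello hello hello hello".getBytes(StandardCharsets.US_ASCII);

        // Default level; zlib header and checksum included.
        byte[] viaDefault = compress(new Deflater(), data);
        // Explicit level; zlib header and checksum included.
        byte[] viaLevel = compress(new Deflater(Deflater.BEST_COMPRESSION), data);
        // Explicit level; raw deflate data without the zlib header and checksum.
        byte[] viaRaw = compress(new Deflater(Deflater.BEST_COMPRESSION, true), data);

        System.out.println("Deflater():          " + viaDefault.length + " bytes");
        System.out.println("Deflater(9):         " + viaLevel.length + " bytes");
        System.out.println("Deflater(9, true):   " + viaRaw.length + " bytes");
    }
}
```

The raw variant should come out a few bytes shorter than the other two, because the zlib wrapper is left off.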

2. Choose a strategy

The next step is to choose the strategy. Java supports three strategies: filtered, Huffman, and default. These are represented by the mnemonic constants Deflater.FILTERED, Deflater.HUFFMAN_ONLY, and Deflater.DEFAULT_STRATEGY, respectively. The setStrategy() method chooses one of these strategies.

public static final int DEFAULT_STRATEGY = 0;

public static final int FILTERED = 1;

public static final int HUFFMAN_ONLY = 2;

public synchronized void setStrategy(int strategy)

This method throws an IllegalArgumentException if an unrecognized strategy is passed as an argument. If no strategy is chosen explicitly, then the default strategy is used.
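To get a feel for what the strategy changes, the following sketch compresses one repetitive buffer under each strategy and also checks that an unknown value is rejected. StrategyDemo, compressedSize(), isRejected(), and the test data are hypothetical names chosen for this example.

```java
import java.util.zip.Deflater;

class StrategyDemo {

    // Hypothetical helper: returns the compressed size of data under the given strategy.
    static int compressedSize(byte[] data, int strategy) {
        Deflater def = new Deflater();
        try {
            def.setStrategy(strategy);
            def.setInput(data);
            def.finish();
            byte[] buffer = new byte[256];
            int total = 0;
            while (!def.finished()) {
                total += def.deflate(buffer);
            }
            return total;
        } finally {
            def.end();
        }
    }

    // Returns true if setStrategy() rejects the value with an IllegalArgumentException.
    static boolean isRejected(int strategy) {
        Deflater def = new Deflater();
        try {
            def.setStrategy(strategy);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } finally {
            def.end();
        }
    }

    public static void main(String[] args) {
        // Highly repetitive sample data: "abcabcabc..."
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) ('a' + i % 3);
        }
        System.out.println("DEFAULT_STRATEGY: " + compressedSize(data, Deflater.DEFAULT_STRATEGY));
        System.out.println("FILTERED:         " + compressedSize(data, Deflater.FILTERED));
        System.out.println("HUFFMAN_ONLY:     " + compressedSize(data, Deflater.HUFFMAN_ONLY));
        System.out.println("99 rejected:      " + isRejected(99));
    }
}
```

On data like this, HUFFMAN_ONLY should produce noticeably larger output, since it encodes each byte on its own and never replaces repeated sequences with back-references.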

3. Set the compression level

The deflater compresses by trying to match the data it's looking at now to data it's already seen earlier in the stream. The compression level determines how far back in the stream the deflater looks for a match.

public synchronized void setLevel(int level)

As with the Deflater() constructors, the compression level should be an int between 0 and 9 (no compression to maximum compression) or -1, signifying the default compression level. Any other value will cause an IllegalArgumentException. It's good coding style to use one of the mnemonic constants Deflater.NO_COMPRESSION (0), Deflater.BEST_SPEED (1), Deflater.BEST_COMPRESSION (9), or Deflater.DEFAULT_COMPRESSION (-1) instead of an explicit value.
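A small sketch of how the level affects the result, assuming a repetitive sample buffer: LevelDemo and compressedSize() are names invented for this example, and the data is synthetic.

```java
import java.util.zip.Deflater;

class LevelDemo {

    // Hypothetical helper: returns the compressed size of data at the given level.
    // An out-of-range level makes setLevel() throw IllegalArgumentException.
    static int compressedSize(byte[] data, int level) {
        Deflater def = new Deflater();
        try {
            def.setLevel(level);
            def.setInput(data);
            def.finish();
            byte[] buffer = new byte[256];
            int total = 0;
            while (!def.finished()) {
                total += def.deflate(buffer);
            }
            return total;
        } finally {
            def.end();
        }
    }

    public static void main(String[] args) {
        // Repetitive sample data: "abcdabcd..."
        byte[] data = new byte[2000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) ('a' + i % 4);
        }
        System.out.println("input:               " + data.length);
        System.out.println("NO_COMPRESSION:      " + compressedSize(data, Deflater.NO_COMPRESSION));
        System.out.println("BEST_SPEED:          " + compressedSize(data, Deflater.BEST_SPEED));
        System.out.println("DEFAULT_COMPRESSION: " + compressedSize(data, Deflater.DEFAULT_COMPRESSION));
        System.out.println("BEST_COMPRESSION:    " + compressedSize(data, Deflater.BEST_COMPRESSION));
        try {
            compressedSize(data, 10);
        } catch (IllegalArgumentException e) {
            System.out.println("level 10 rejected");
        }
    }
}
```

With NO_COMPRESSION the output is slightly larger than the input, because the data is stored as-is and the stream framing is added on top.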

4. Set the dictionary

You can think of the deflater as building a dictionary of phrases as it reads the text. The first time it sees a phrase, it puts the phrase in the dictionary. The second time it sees the phrase, it replaces the phrase with its position in the dictionary. However, it can't do this until it's seen the phrase at least once, so data early in the stream isn't compressed very well compared to data that occurs later in the stream.

On rare occasions, when you have a good idea that certain byte sequences appear in the data very frequently, you can preset the dictionary used for compression. You would fill the dictionary with the frequently repeated data in the text. For instance, if your text is composed completely of ASCII digits and assorted whitespace (tabs, carriage returns, and so forth) you could put those characters in your dictionary. This allows the early part of the stream to compress as well as later parts. The same dictionary must be supplied to the Inflater when the data is decompressed.

There are two setDictionary() methods. The first uses the entire byte array passed as an argument as the dictionary. The second uses the subarray of data starting at offset and continuing for length bytes.

public void setDictionary(byte[] data)

public native synchronized void setDictionary(byte[] data, int offset, int length)
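The sketch below shows one way a preset dictionary might be used end to end, including the decompression side, where Inflater reports needsDictionary() and has to be handed the same bytes. DictionaryDemo, its two helpers, and the sample dictionary and message are all invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DictionaryDemo {

    // Hypothetical helper: deflates data, presetting the dictionary if one is given.
    static byte[] deflate(byte[] data, byte[] dictionary) {
        Deflater def = new Deflater();
        try {
            if (dictionary != null) {
                def.setDictionary(dictionary); // set before the first deflate() call
            }
            def.setInput(data);
            def.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!def.finished()) {
                int n = def.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            def.end();
        }
    }

    // Hypothetical helper: inflates data, supplying the dictionary when the stream asks for it.
    static byte[] inflate(byte[] compressed, byte[] dictionary) {
        Inflater inf = new Inflater();
        try {
            inf.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!inf.finished()) {
                int n = inf.inflate(buffer);
                if (n > 0) {
                    out.write(buffer, 0, n);
                } else if (inf.needsDictionary()) {
                    if (dictionary == null) {
                        throw new IllegalStateException("stream requires a dictionary");
                    }
                    inf.setDictionary(dictionary);
                } else if (inf.needsInput()) {
                    throw new IllegalStateException("compressed data is truncated");
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
    }

    public static void main(String[] args) {
        byte[] dictionary = "temperature=humidity=pressure=".getBytes(StandardCharsets.US_ASCII);
        byte[] message = "temperature=21 humidity=40 pressure=1013".getBytes(StandardCharsets.US_ASCII);

        byte[] plain = deflate(message, null);
        byte[] preset = deflate(message, dictionary);
        System.out.println("without dictionary: " + plain.length + " bytes");
        System.out.println("with dictionary:    " + preset.length + " bytes");
        System.out.println(new String(inflate(preset, dictionary), StandardCharsets.US_ASCII));
    }
}
```

A preset dictionary tends to pay off only for short inputs that share a lot of content with it; for long inputs the deflater builds up its own history quickly.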

5. Set the input

Next you must set the input data to be deflated with one of the setInput() methods:

public void setInput(byte[] input)

public synchronized void setInput(byte[] input, int offset, int length)

The first method prepares the entire array to be deflated. The second method prepares the specified subarray of data starting at offset and continuing for length bytes.
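The three-argument form is handy when only part of a buffer holds valid data, for example after a read() that did not fill the whole array. As a minimal sketch (SetInputDemo, compress(), and the sample strings are assumptions for this example), compressing a slice gives the same bytes as compressing a copy of that slice:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

class SetInputDemo {

    // Hypothetical helper: deflates only input[offset .. offset + length - 1].
    static byte[] compress(byte[] input, int offset, int length) {
        Deflater def = new Deflater();
        try {
            def.setInput(input, offset, length);
            def.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!def.finished()) {
                int n = def.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            def.end();
        }
    }

    public static void main(String[] args) {
        // Only the five bytes in the middle are meant to be compressed.
        byte[] padded = "XXXXhelloYYYY".getBytes(StandardCharsets.US_ASCII);
        byte[] exact = "hello".getBytes(StandardCharsets.US_ASCII);

        byte[] fromSlice = compress(padded, 4, 5);
        byte[] fromWhole = compress(exact, 0, exact.length);
        System.out.println("same output: " + Arrays.equals(fromSlice, fromWhole));
    }
}
```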

6. Deflate the data repeatedly until needsInput() returns true

Finally, you're ready to deflate the data. Once setInput() has filled the input buffer with data, it is deflated through one of two deflate() methods:

public int deflate(byte[] output)

public native synchronized int deflate(byte[] output, int offset, int length)

The first method fills the specified output array with the bytes of compressed data. The second fills the specified subarray of output beginning at offset and continuing for length bytes with the compressed data. Both methods return the actual number of compressed bytes written into the array. You do not know in advance how many compressed bytes will actually be written into output, because you do not know how well the data will compress. You always have to check the return value. If deflate() returns 0, you should check needsInput() to see if you need to call setInput() again to provide more uncompressed input data:

public boolean needsInput()

When more data is needed, the needsInput() method returns true. At this point you should invoke setInput() again to feed in more uncompressed input data, call deflate(), and repeat the process until deflate() returns 0 and there is no more input data to be compressed.

7. Finish the deflation

Finally, when the input data is exhausted, invoke finish() to indicate that no more data is forthcoming and the deflater should finish with the data it already has in its buffer:

public synchronized void finish()

The finished() method returns true when the end of the compressed output has been reached; that is, when all data stored in the input buffer has been deflated:

public synchronized boolean finished()

After calling finish(), you invoke deflate() repeatedly until finished() returns true. This flushes out any data that remains in the input buffer.
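Put together, the feed, deflate, and finish steps might look like the following in-memory sketch. It hands the deflater its input in small chunks and uses a deliberately small output buffer so that both loops run more than once. ChunkedDeflate, deflateInChunks(), and inflate() are hypothetical names; inflate() is there only to check the round trip.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class ChunkedDeflate {

    // Hypothetical helper: feeds data to the deflater chunkSize bytes at a time.
    static byte[] deflateInChunks(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        Deflater def = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[64]; // small on purpose, so deflate() is called repeatedly
        try {
            for (int offset = 0; offset < data.length; offset += chunkSize) {
                int length = Math.min(chunkSize, data.length - offset);
                def.setInput(data, offset, length);
                // Drain output until the deflater asks for the next chunk.
                while (!def.needsInput()) {
                    int n = def.deflate(buffer);
                    out.write(buffer, 0, n);
                }
            }
            // No more input: flush whatever the deflater is still holding.
            def.finish();
            while (!def.finished()) {
                int n = def.deflate(buffer);
                out.write(buffer, 0, n);
            }
        } finally {
            def.end();
        }
        return out.toByteArray();
    }

    // Hypothetical helper used only to verify the result.
    static byte[] inflate(byte[] compressed) {
        Inflater inf = new Inflater();
        try {
            inf.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            while (!inf.finished()) {
                int n = inf.inflate(buffer);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalStateException("compressed data is incomplete");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[5000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 17);
        }
        byte[] compressed = deflateInChunks(data, 512);
        System.out.println(data.length + " bytes in, " + compressed.length + " bytes out");
        System.out.println("round trip ok: " + Arrays.equals(inflate(compressed), data));
    }
}
```

Note that deflate() often returns 0 while input is still being fed, because the deflater buffers data internally; most of the output typically appears only after finish() has been called.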

8. Reset the deflater and start over

This completes the sequence of method invocations required to compress data. If you'd like to use the same strategy, compression level, and other settings to compress more data with the same Deflater, call its reset() method:

public native synchronized void reset()

Otherwise, call end() to throw away any unprocessed input and free the resources used by the native code:

public native synchronized void end()

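In older Java versions, the finalize() method calls end() before the deflater is garbage-collected, if you forget:

protected void finalize()

Newer JDK releases have deprecated and then removed this method, so call end() explicitly rather than relying on it.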
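One possible pattern for reusing a single deflater across several independent streams is sketched below: reset() after each stream, and end() in a finally block once everything is done. ResetDemo, compressAll(), and the sample inputs are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Deflater;

class ResetDemo {

    // Hypothetical helper: compresses each input as its own stream with one shared Deflater.
    static List<byte[]> compressAll(List<byte[]> inputs) {
        Deflater def = new Deflater(Deflater.BEST_SPEED);
        List<byte[]> results = new ArrayList<>();
        byte[] buffer = new byte[256];
        try {
            for (byte[] input : inputs) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                def.setInput(input);
                def.finish();
                while (!def.finished()) {
                    int n = def.deflate(buffer);
                    out.write(buffer, 0, n);
                }
                results.add(out.toByteArray());
                // Keep the level and strategy, but start the next stream from scratch.
                def.reset();
            }
        } finally {
            // Release the native resources exactly once, even if something failed.
            def.end();
        }
        return results;
    }

    public static void main(String[] args) {
        byte[] first = "first stream, first stream".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "second stream, second stream".getBytes(StandardCharsets.US_ASCII);
        List<byte[]> results = compressAll(Arrays.asList(first, second));
        for (byte[] result : results) {
            System.out.println(result.length + " compressed bytes");
        }
    }
}
```

Because reset() discards all stream state, each result is a complete stream that can be inflated on its own.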

Example:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;

public class DirectDeflater {

  public final static String DEFLATE_SUFFIX = ".dfl";

  public static void main(String[] args) {

    Deflater def = new Deflater();
    byte[] input = new byte[1024];
    byte[] output = new byte[1024];

    for (int i = 0; i < args.length; i++) {

      // try-with-resources closes both streams even if an exception is thrown.
      try (FileInputStream fin = new FileInputStream(args[i]);
           FileOutputStream fout = new FileOutputStream(args[i] + DEFLATE_SUFFIX)) {

        while (true) { // read and deflate the data
          // Fill the input array.
          int numRead = fin.read(input);
          if (numRead == -1) { // end of stream
            // Deflate any data that remains in the input buffer.
            def.finish();
            while (!def.finished()) {
              int numCompressedBytes = def.deflate(output, 0, output.length);
              if (numCompressedBytes > 0) {
                fout.write(output, 0, numCompressedBytes);
              }
            }
            break; // Exit while loop.
          }
          else { // Deflate the input.
            def.setInput(input, 0, numRead);
            while (!def.needsInput()) {
              int numCompressedBytes = def.deflate(output, 0, output.length);
              if (numCompressedBytes > 0) {
                fout.write(output, 0, numCompressedBytes);
              }
            }
          }
        }
      }
      catch (IOException e) {
        System.err.println(e);
      }
      finally {
        // Reset even after a failure so the next file starts a fresh stream.
        def.reset();
      }
    }

    // Free the native resources held by the deflater.
    def.end();
  }
}
