@tc
Created August 26, 2010 22:01
package opennlp.maxent.io;
///////////////////////////////////////////////////////////////////////////////
// Copyright (C) 2001 Jason Baldridge and Gann Bierner
//
// This library is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 2.1 of the License, or (at your option) any later version.
//
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
//////////////////////////////////////////////////////////////////////////////
import java.io.*;
import java.util.zip.*;
/**
* A reader for GIS models which inspects the filename and invokes the
* appropriate GISModelReader depending on the filename's suffixes.
* <p>The following assumptions are made about suffixes:
* <ul>
* <li>.gz --> the file is gzipped (must be the last suffix)
* <li>.txt --> the file is plain text
* <li>.bin --> the file is binary
* </ul>
*
* @author Jason Baldridge
* @version $Revision: 1.3.2.1 $, $Date: 2008/11/28 13:55:35 $
*/
public class SuffixSensitiveGISModelReader extends GISModelReader {
private final GISModelReader suffixAppropriateReader;
/**
* Constructor which takes a File and invokes the GISModelReader
* appropriate for the suffix.
*
* @param f The File in which the model is stored.
*/
public SuffixSensitiveGISModelReader(File f) throws IOException {
this(f.getName(), new FileInputStream(f));
}
/**
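* Constructor which takes an InputStream and invokes the GISModelReader
* appropriate for the suffix. This allows you to read models from
* resources within a jar.
*
* @param filename The name of the model file; only its suffixes are used, to select the format.
* @param input The stream from which the model is read.
*/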
public SuffixSensitiveGISModelReader(String filename, InputStream input) throws IOException {
// handle the zipped/not zipped distinction
if (filename.endsWith(".gz")) {
input = new BufferedInputStream(new GZIPInputStream(input));
filename = filename.substring(0, filename.length() - 3);
}
// handle the different formats
if (filename.endsWith(".bin")) {
suffixAppropriateReader =
new BinaryGISModelReader(new DataInputStream(input));
}
// add more else ifs here to add further Reader types, e.g.
// else if (filename.endsWith(".xml"))
// suffixAppropriateReader = new XmlGISModelReader(input);
// of course, a BufferedReader may not be what is wanted here,
// so you might have to do a bit more to get
// SuffixSensitiveGISModelReader to work for xml or other formats.
// However, the default should be plain text (.txt).
else { // default: plain text (".txt" or any unrecognized suffix)
suffixAppropriateReader =
new PlainTextGISModelReader(
new BufferedReader(new InputStreamReader(input)));
}
}
// activate this if adding another type of reader which can't read model
// information in the way that the default getModel() method in
// GISModelReader does.
//public GISModel getModel () throws java.io.IOException {
// return suffixAppropriateReader.getModel();
//}
protected int readInt() throws IOException {
return suffixAppropriateReader.readInt();
}
protected double readDouble() throws IOException {
return suffixAppropriateReader.readDouble();
}
protected String readUTF() throws IOException {
return suffixAppropriateReader.readUTF();
}
/**
* To convert between different formats of the new style.
* <p/>
* <p>java opennlp.maxent.io.SuffixSensitiveGISModelReader old_model_name new_model_name
* <p/>
* <p>For example, to convert a model called "model.bin.gz" (which is thus
* saved in gzipped binary format) to one in (unzipped) text format:
* <p/>
* <p>java opennlp.maxent.io.SuffixSensitiveGISModelReader model.bin.gz model.txt
* <p/>
* <p>This particular example would of course be useful when you generally
* want to create models which take up less space (.bin.gz), but want to
* be able to inspect a few of them as plain text files.
*/
public static void main(String[] args) throws IOException {
if (args.length == 2) {
opennlp.maxent.GISModel m =
new SuffixSensitiveGISModelReader(new File(args[0])).getModel();
new SuffixSensitiveGISModelWriter(m, new File(args[1])).persist();
} else {
System.err.println("Usage: SuffixSensitiveGISModelReader model1 model2");
System.err.println("Loads model1 and converts it into the model file format specified by the model2 name.");
}
}
}
@tc
Author

tc commented Sep 9, 2010

Modification of SuffixSensitiveGISModelReader to read from an InputStream as well as from a File object. This lets you load models from resources packaged inside a jar.
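
The suffix handling in the stream constructor can be tried without opennlp on the classpath. The following is only a minimal sketch under stated assumptions: `SuffixDispatchSketch`, `readFirstInt`, and `gzip` are hypothetical names that are not part of opennlp, the text format is assumed to hold one value per line, and UTF-8 is assumed for the text case (the class above uses the platform default charset).

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for the suffix dispatch; not part of opennlp. */
final class SuffixDispatchSketch {

    /** Reads the first int from a stream, choosing the decoding from the filename suffixes. */
    static int readFirstInt(String filename, InputStream input) throws IOException {
        // A trailing ".gz" means the payload is gzipped: unwrap it and drop the suffix.
        if (filename.endsWith(".gz")) {
            input = new BufferedInputStream(new GZIPInputStream(input));
            filename = filename.substring(0, filename.length() - 3);
        }
        if (filename.endsWith(".bin")) {
            // Binary: four big-endian bytes, as written by DataOutputStream.writeInt.
            return new DataInputStream(input).readInt();
        }
        // Default: plain text, assumed here to hold one value per line.
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8));
        return Integer.parseInt(reader.readLine().trim());
    }

    /** Gzips a byte array in memory so the demo needs no files. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream binary = new ByteArrayOutputStream();
        new DataOutputStream(binary).writeInt(42);
        byte[] text = "42\n".getBytes(StandardCharsets.UTF_8);

        // The same value is recovered from three differently named streams.
        System.out.println(readFirstInt("model.bin",
            new ByteArrayInputStream(binary.toByteArray())));
        System.out.println(readFirstInt("model.bin.gz",
            new ByteArrayInputStream(gzip(binary.toByteArray()))));
        System.out.println(readFirstInt("model.txt.gz",
            new ByteArrayInputStream(gzip(text))));
    }
}
```

Because the name is used only for its suffixes, the stream itself can come from anywhere, including `getResourceAsStream` on a class whose jar contains the model.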
