Strategy Patterns

Since this tutorial was written, the result has been broken out into a distinct gt-csv module. The work was also forked for use in the GeoServer importer module. GeoServer’s fork originally added a nifty CSVStrategy (with implementations CSVAttributesOnlyStrategy, CSVLatLonStrategy, CSVSpecifiedLatLngStrategy and CSVSpecifiedWKTStrategy). This section discusses the new implementation, which cleans up and improves on that strategy pattern and supports both reading and writing. If you want to follow along with the code, it is available as an unsupported plugin.

CSVDataStore

CSVDataStore now uses a CSVStrategy and CSVFileState. The CSVFileState holds information about the file we are reading from. CSVStrategy is an abstract class for the strategy objects CSVDataStore can use.

public class CSVDataStore extends ContentDataStore implements FileDataStore {

    private final CSVStrategy csvStrategy;

    private final CSVFileState csvFileState;

    public CSVDataStore(CSVFileState csvFileState, CSVStrategy csvStrategy) {
        this.csvFileState = csvFileState;
        this.csvStrategy = csvStrategy;
    }

    // docs start getTypeName

With the CSVFileState doing the work for us, the getTypeName() method that creates the Name is much simpler.

    public Name getTypeName() {
        if (namespaceURI != null) {
            return new NameImpl(namespaceURI, csvFileState.getTypeName());
        } else {
            return new NameImpl(csvFileState.getTypeName());
        }
    }

CSVDataStore now implements the FileDataStore interface, which defines the standard operations for file-based data stores, so it must override that interface’s methods. Note that the CSVStrategy determines the schema: depending on the strategy defined, the schema for this store will be different. The implementation of createFeatureSource() checks that the file is writable before allowing features to be written. If it is, a CSVFeatureStore is returned instead of a CSVFeatureSource; the store can be written to as well as read from.

    @Override
    protected List<Name> createTypeNames() throws IOException {
        return Collections.singletonList(getTypeName());
    }

    @Override
    protected ContentFeatureSource createFeatureSource(ContentEntry entry) throws IOException {
        if (csvFileState.getFile().canWrite()) {
            return new CSVFeatureStore(csvStrategy, csvFileState, entry, Query.ALL);
        } else {
            return new CSVFeatureSource(entry, Query.ALL);
        }
    }

    @Override
    public SimpleFeatureType getSchema() throws IOException {
        return this.csvStrategy.getFeatureType();
    }

    @Override
    public void updateSchema(SimpleFeatureType featureType) throws IOException {
        throw new UnsupportedOperationException();
    }

    @Override
    public SimpleFeatureSource getFeatureSource() throws IOException {
        return new CSVFeatureSource(this);
    }

    @Override
    public FeatureReader<SimpleFeatureType, SimpleFeature> getFeatureReader() throws IOException {
        return new CSVFeatureSource(this).getReader();
    }

    @Override
    public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriter(
            Filter filter, Transaction transaction) throws IOException {
        return super.getFeatureWriter(this.csvFileState.getTypeName(), filter, transaction);
    }

    @Override
    public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriter(Transaction transaction)
            throws IOException {
        return super.getFeatureWriter(this.csvFileState.getTypeName(), transaction);
    }

    @Override
    public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriterAppend(
            Transaction transaction) throws IOException {
        return super.getFeatureWriterAppend(this.csvFileState.getTypeName(), transaction);
    }

    public CSVStrategy getCSVStrategy() {
        return csvStrategy;
    }

    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        this.csvStrategy.createSchema(featureType);
    }

CSVDataStoreFactory

The new architecture with the added strategy objects expands the CSVDataStoreFactory’s capabilities. It contains a few more Param fields now. Much of the class’s structure is improved to be more compartmentalized. The metadata is mostly the same with some data now being held in class fields rather than literals.

import java.awt.RenderingHints.Key;
import java.io.File;
import java.io.IOException;
import java.io.Serializable;
import java.net.URI;
import java.net.URL;
import java.util.Collections;
import java.util.Map;
import org.apache.commons.io.FilenameUtils;
import org.geotools.data.DataStore;
import org.geotools.data.FileDataStore;
import org.geotools.data.FileDataStoreFactorySpi;
import org.geotools.data.csv.parse.CSVAttributesOnlyStrategy;
import org.geotools.data.csv.parse.CSVLatLonStrategy;
import org.geotools.data.csv.parse.CSVSpecifiedWKTStrategy;
import org.geotools.data.csv.parse.CSVStrategy;
import org.geotools.factory.CommonFactoryFinder;
import org.geotools.feature.type.FeatureTypeFactoryImpl;
import org.geotools.util.KVP;
import org.geotools.util.URLs;

public class CSVDataStoreFactory implements FileDataStoreFactorySpi {

    /** GUESS_STRATEGY */
    public static final String GUESS_STRATEGY = "guess";

    /** ATTRIBUTES_ONLY_STRATEGY */
    public static final String ATTRIBUTES_ONLY_STRATEGY = "AttributesOnly";

    /** SPECIFC_STRATEGY */
    public static final String SPECIFC_STRATEGY = "specify";

    /** WKT_STRATEGY */
    public static final String WKT_STRATEGY = "wkt";

    private static final String FILE_TYPE = "csv";

    public static final String[] EXTENSIONS = new String[] {"." + FILE_TYPE};

    public static final Param FILE_PARAM =
            new Param("file", File.class, FILE_TYPE + " file", false);

    public static final Param URL_PARAM = new Param("url", URL.class, FILE_TYPE + " file", false);

    public static final Param NAMESPACEP =
            new Param(
                    "namespace",
                    URI.class,
                    "uri to the namespace",
                    false,
                    null,
                    new KVP(Param.LEVEL, "advanced"));

    public static final Param STRATEGYP = new Param("strategy", String.class, "strategy", false);

    public static final Param LATFIELDP =
            new Param(
                    "latField",
                    String.class,
                    "Latitude field. Assumes a CSVSpecifiedLatLngStrategy",
                    false);

    public static final Param LnGFIELDP =
            new Param(
                    "lngField",
                    String.class,
                    "Longitude field. Assumes a CSVSpecifiedLatLngStrategy",
                    false);

    public static final Param WKTP =
            new Param(
                    "wktField",
                    String.class,
                    "WKT field. Assumes a CSVSpecifiedWKTStrategy",
                    false);

    public static final Param[] parametersInfo =
            new Param[] {FILE_PARAM, NAMESPACEP, STRATEGYP, LATFIELDP, LnGFIELDP, WKTP};

The method isAvailable() just attempts to read the class, and if it succeeds, returns true.

    @Override
    public boolean isAvailable() {
        try {
            CSVDataStore.class.getName();
        } catch (Exception e) {
            return false;
        }
        return true;
    }

The canProcess(Map<String, Serializable> params) method is now more tolerant: it accepts either a File or a URL parameter, resolved through the fileFromParams(Map<String, Serializable> params) method, which tries File first and then URL before giving up. The canProcess(URL url) overload converts the URL to a file and checks its extension.

    @Override
    public boolean canProcess(URL url) {
        return canProcessExtension(URLs.urlToFile(url).toString());
    }
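
The fileFromParams() method itself is not listed here, so the following is only a minimal sketch of the "File first, then URL" lookup described above. The class name FileParamsSketch, the literal keys "file" and "url", and the extension check are assumptions made for illustration; the real factory works with its Param objects and URLs.urlToFile().

```java
import java.io.File;
import java.io.Serializable;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not the gt-csv source: resolves a File from connection parameters. */
class FileParamsSketch {

    /** Returns the file named by "file" or, failing that, "url"; null when neither is usable. */
    static File fileFromParams(Map<String, Serializable> params) {
        Object file = params.get("file");
        if (file instanceof File) {
            return (File) file; // the File parameter wins when present
        }
        Object url = params.get("url");
        if (url instanceof URL) {
            try {
                return new File(((URL) url).toURI()); // only succeeds for file: URLs
            } catch (URISyntaxException | IllegalArgumentException e) {
                return null; // not a local file, give up
            }
        }
        return null;
    }

    /** Accepts the parameters only if they resolve to a file name ending in .csv. */
    static boolean canProcess(Map<String, Serializable> params) {
        File file = fileFromParams(params);
        return file != null && file.getName().toLowerCase().endsWith(".csv");
    }

    public static void main(String[] args) throws Exception {
        Map<String, Serializable> params = new HashMap<>();
        params.put("url", new File("locations.csv").toURI().toURL());
        System.out.println(canProcess(params));
    }
}
```

Returning null rather than throwing keeps canProcess() a cheap yes/no check, which is how a factory lookup is normally expected to behave.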

Finally, the different strategies are implemented in the createDataStoreFromFile() method. The method is overloaded to make some parameters optional, which the class will then fill in for us.

    public FileDataStore createDataStoreFromFile(File file) throws IOException {
        return createDataStoreFromFile(file, null);
    }

    public FileDataStore createDataStoreFromFile(File file, URI namespace) throws IOException {
        if (file == null) {
            throw new IllegalArgumentException("Cannot create store from null file");
        } else if (!file.exists()) {
            throw new IllegalArgumentException("Cannot create store with file that does not exist");
        }
        Map<String, Serializable> noParams = Collections.emptyMap();
        return createDataStoreFromFile(file, namespace, noParams);
    }

CSVFeatureReader

The CSVFeatureReader now delegates much of the functionality to a new class called CSVIterator as well as the CSVStrategy. The resulting code is very clean and short.

public class CSVFeatureReader implements FeatureReader<SimpleFeatureType, SimpleFeature> {

    private SimpleFeatureType featureType;

    private CSVIterator iterator;

    public CSVFeatureReader(CSVStrategy csvStrategy) throws IOException {
        this(csvStrategy, Query.ALL);
    }

    public CSVFeatureReader(CSVStrategy csvStrategy, Query query) throws IOException {
        this.featureType = csvStrategy.getFeatureType();
        this.iterator = csvStrategy.iterator();
    }

    @Override
    public SimpleFeatureType getFeatureType() {
        return featureType;
    }

    @Override
    public void close() throws IOException {
        iterator.close();
    }

    @Override
    public SimpleFeature next()
            throws IOException, IllegalArgumentException, NoSuchElementException {
        return iterator.next();
    }

    @Override
    public boolean hasNext() throws IOException {
        return iterator.hasNext();
    }
}

CSVFeatureSource

CSVFeatureSource retains the same basic structure, but the code is assisted by the new classes. It now overloads the constructor:

@SuppressWarnings("unchecked")
public class CSVFeatureSource extends ContentFeatureSource {

    public CSVFeatureSource(CSVDataStore datastore) {
        this(datastore, Query.ALL);
    }

    public CSVFeatureSource(CSVDataStore datastore, Query query) {
        this(new ContentEntry(datastore, datastore.getTypeName()), query);
    }

    public CSVFeatureSource(ContentEntry entry) {
        this(entry, Query.ALL);
    }

    public CSVFeatureSource(ContentEntry entry, Query query) {
        super(entry, query);
    }

The getBoundsInternal(Query query) method is now implemented using the methods provided by ContentFeatureSource. A new ReferencedEnvelope is created to store the bounds for this feature source, using the CRS (getCoordinateReferenceSystem()) of the feature type (getSchema()). The FeatureReader is now created from the Query and the CSVStrategy: the getReader() method calls getReaderInternal(Query query), which shows how it is created. Finally, the reader is used to cycle through the features, including each one in the envelope to calculate the bounds of the entire datastore.

    protected ReferencedEnvelope getBoundsInternal(Query query) throws IOException {
        ReferencedEnvelope bounds =
                new ReferencedEnvelope(getSchema().getCoordinateReferenceSystem());
        FeatureReader<SimpleFeatureType, SimpleFeature> featureReader = getReader(query);
        try {
            while (featureReader.hasNext()) {
                SimpleFeature feature = featureReader.next();
                bounds.include(feature.getBounds());
            }
        } finally {
            featureReader.close();
        }
        return bounds;
    }
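
To make the accumulation step concrete without pulling in GeoTools, here is a small sketch of the same idea. EnvelopeSketch is a hypothetical stand-in for ReferencedEnvelope that ignores the CRS, and each feature is reduced to an {x, y} pair.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for ReferencedEnvelope; it only tracks min/max of x and y. */
class EnvelopeSketch {
    double minX = Double.POSITIVE_INFINITY;
    double minY = Double.POSITIVE_INFINITY;
    double maxX = Double.NEGATIVE_INFINITY;
    double maxY = Double.NEGATIVE_INFINITY;

    /** True until at least one point has been included. */
    boolean isEmpty() {
        return minX > maxX;
    }

    /** Grows the envelope just enough to contain the point. */
    void include(double x, double y) {
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
    }

    /** Mirrors the loop in getBoundsInternal(): visit every feature and include it. */
    static EnvelopeSketch boundsOf(List<double[]> points) {
        EnvelopeSketch bounds = new EnvelopeSketch();
        for (double[] point : points) {
            bounds.include(point[0], point[1]);
        }
        return bounds;
    }

    public static void main(String[] args) {
        EnvelopeSketch bounds =
                boundsOf(Arrays.asList(new double[] {1, 2}, new double[] {-3, 5}));
        System.out.println(bounds.minX + " " + bounds.minY + " " + bounds.maxX + " " + bounds.maxY);
    }
}
```

Like the real method, this is a full scan: the cost grows with the number of features because a CSV file has no index to consult.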

The getReaderInternal(Query query) method now utilizes the strategy of the CSVDataStore rather than state to reflect the changes to the CSVFeatureReader design.

    protected FeatureReader<SimpleFeatureType, SimpleFeature> getReaderInternal(Query query)
            throws IOException {
        CSVDataStore dataStore = getDataStore();
        return new CSVFeatureReader(dataStore.getCSVStrategy(), query);
    }

The getCountInternal(Query query) method uses the same idea as getBoundsInternal(Query query): it uses the Query and CSVStrategy to obtain a FeatureReader, then simply counts the features it returns.

    protected int getCountInternal(Query query) throws IOException {
        FeatureReader<SimpleFeatureType, SimpleFeature> featureReader = getReaderInternal(query);
        int n = 0;
        try {
            for (n = 0; featureReader.hasNext(); n++) {
                featureReader.next();
            }
        } finally {
            featureReader.close();
        }
        return n;
    }

The buildFeatureType() method is now very simple, using getSchema() to grab the feature type of the datastore.

    protected SimpleFeatureType buildFeatureType() throws IOException {
        return getDataStore().getSchema();
    }

CSVFeatureStore

CSVFeatureStore essentially acts as a read/write version of CSVFeatureSource. Where CSVFeatureSource is only readable, CSVFeatureStore adds the ability to write through the use of a CSVFeatureWriter. The code is updated to use the strategy pattern which it must pass to the writer.

/**
 * Read-write access to CSV File.
 *
 * @author Jody Garnett (Boundless)
 * @author Ian Turton (Envitia)
 */
public class CSVFeatureStore extends ContentFeatureStore {
    private CSVStrategy csvStrategy;
    private CSVFileState csvFileState;

    public CSVFeatureStore(
            CSVStrategy csvStrategy, CSVFileState csvFileState, ContentEntry entry, Query query) {
        super(entry, query);

        this.csvStrategy = csvStrategy;
        this.csvFileState = csvFileState;
    }
    // header end
    // getWriter start
    //
    // CSVFeatureStore implementations
    //
    @Override
    protected FeatureWriter<SimpleFeatureType, SimpleFeature> getWriterInternal(
            Query query, int flags) throws IOException {
        return new CSVFeatureWriter(this.csvFileState, this.csvStrategy, query);
    }
    // getWriter end

    // transaction start
    /**
     * Delegate used for FeatureSource methods (We do this because Java cannot inherit from both
     * ContentFeatureStore and CSVFeatureSource at the same time)
     */
    CSVFeatureSource delegate =
            new CSVFeatureSource(entry, query) {
                @Override
                public void setTransaction(Transaction transaction) {
                    super.setTransaction(transaction);
                    CSVFeatureStore.this.setTransaction(
                            transaction); // Keep these two implementations on the same transaction
                }
            };

    @Override
    public void setTransaction(Transaction transaction) {
        super.setTransaction(transaction);
        if (delegate.getTransaction() != transaction) {
            delegate.setTransaction(transaction);
        }
    }
    // transaction end

    // internal start
    //
    // Internal Delegate Methods
    // Implement FeatureSource methods using CSVFeatureSource implementation
    //
    @Override
    protected SimpleFeatureType buildFeatureType() throws IOException {
        return delegate.buildFeatureType();
    }

    @Override
    protected ReferencedEnvelope getBoundsInternal(Query query) throws IOException {
        return delegate.getBoundsInternal(query);
    }

    @Override
    protected int getCountInternal(Query query) throws IOException {
        return delegate.getCountInternal(query);
    }

    @Override
    protected FeatureReader<SimpleFeatureType, SimpleFeature> getReaderInternal(Query query)
            throws IOException {
        return delegate.getReaderInternal(query);
    }
    // internal end

    // public start
    //
    // Public Delegate Methods
    // Implement FeatureSource methods using CSVFeatureSource implementation
    //
    @Override
    public CSVDataStore getDataStore() {
        return delegate.getDataStore();
    }

    @Override
    public ContentEntry getEntry() {
        return delegate.getEntry();
    }

    public Transaction getTransaction() {
        return delegate.getTransaction();
    }

    public ContentState getState() {
        return delegate.getState();
    }

    public ResourceInfo getInfo() {
        return delegate.getInfo();
    }

    public Name getName() {
        return delegate.getName();
    }

    public QueryCapabilities getQueryCapabilities() {
        return delegate.getQueryCapabilities();
    }
    // public end

}

CSVFeatureWriter

The CSVFeatureWriter handles the writing functionality for our CSVFeatureStore. With the new architecture, a new class called CSVIterator is used as our delegate (private CSVIterator iterator;) rather than the CSVFeatureReader.

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.NoSuchElementException;
import org.geotools.data.DataUtilities;
import org.geotools.data.FeatureWriter;
import org.geotools.data.Query;
import org.geotools.data.csv.parse.CSVIterator;
import org.geotools.data.csv.parse.CSVStrategy;
import org.geotools.feature.simple.SimpleFeatureBuilder;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;

/**
 * Iterator supporting writing of feature content.
 *
 * @author Jody Garnett (Boundless)
 * @author Lee Breisacher
 */
public class CSVFeatureWriter implements FeatureWriter<SimpleFeatureType, SimpleFeature> {
    private SimpleFeatureType featureType;
    private CSVStrategy csvStrategy;
    private CSVFileState csvFileState;

    /** Temporary file used to stage output */
    private File temp;

    /** iterator handling reading of original file */
    private CSVIterator iterator;

    /** CsvWriter used for temp file output */
    private CsvWriter csvWriter;

    /** Flag indicating we have reached the end of the file */
    private boolean appending = false;

    /** Row count used to generate FeatureId when appending */
    int nextRow = 0;

    /** Current feature available for modification. May be null if feature removed */
    private SimpleFeature currentFeature;

    // docs start CSVFeatureWriter

The feature type we grab for writing is dependent on our strategy; therefore, we must feed CSVFeatureWriter our CSVStrategy and grab the feature type from it. We’ll also get our iterator, which reads the file, from our CSVStrategy. Finally, we’ll set up a CsvWriter to write to a new file, temp, with the same headers as our current file.

    public CSVFeatureWriter(CSVFileState csvFileState, CSVStrategy csvStrategy) throws IOException {
        this(csvFileState, csvStrategy, Query.ALL);
    }

    public CSVFeatureWriter(CSVFileState csvFileState, CSVStrategy csvStrategy, Query query)
            throws IOException {
        this.csvFileState = csvFileState;
        File file = csvFileState.getFile();
        File directory = file.getParentFile();
        String typeName = query.getTypeName();
        this.temp = File.createTempFile(typeName + System.currentTimeMillis(), "csv", directory);
        this.featureType = csvStrategy.getFeatureType();
        this.iterator = csvStrategy.iterator();
        this.csvStrategy = csvStrategy;
        this.csvWriter = new CsvWriter(new FileWriter(this.temp), ',');
        this.csvWriter.writeRecord(this.csvFileState.getCSVHeaders());
    }

The hasNext() method will first check if we’re appending content, in which case we are done reading - there is nothing next. Otherwise, it passes off to the CSVIterator’s implementation.

    @Override
    public SimpleFeatureType getFeatureType() {
        return this.featureType;
    }
    // featureType end

    // hasNext start
    @Override
    public boolean hasNext() throws IOException {
        if (csvWriter == null) {
            return false;
        }
        if (this.appending) {
            return false; // reader has no more contents
        }
        return iterator.hasNext();
    }

The next() method will also check if we are appending. If we’re not done reading, we grab the next from our iterator; otherwise, we are done so we want to append content. In this case, it will build the next feature we wish to append. remove() will just mark the current feature to be written as null, preventing it from being written.

    @Override
    public SimpleFeature next()
            throws IOException, IllegalArgumentException, NoSuchElementException {
        if (csvWriter == null) {
            throw new IOException("Writer has been closed");
        }
        if (this.currentFeature != null) {
            this.write(); // the previous one was not written, so do it now.
        }
        try {
            if (!appending) {
                if (iterator.hasNext()) {
                    this.currentFeature = iterator.next();
                    return this.currentFeature;
                } else {
                    this.appending = true;
                }
            }
            String fid = featureType.getTypeName() + "-fid" + nextRow;
            Object values[] = DataUtilities.defaultValues(featureType);

            this.currentFeature = SimpleFeatureBuilder.build(featureType, values, fid);
            return this.currentFeature;
        } catch (IllegalArgumentException invalid) {
            throw new IOException("Unable to create feature:" + invalid.getMessage(), invalid);
        }
    }
    // next end

    // remove start
    /**
     * Mark our {@link #currentFeature} as null; it will be skipped when written, effectively
     * removing it.
     */
    public void remove() throws IOException {
        this.currentFeature = null; // just mark it done which means it will not get written out.
    }
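
The interplay between hasNext(), next(), remove() and write() is easier to follow on an in-memory model. The sketch below is an assumption-laden miniature, not the GeoTools class: Row stands in for SimpleFeature, a list of strings stands in for the CSV file, and another list stands in for the temp file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** In-memory miniature of the writer protocol; all names here are hypothetical. */
class RowWriterSketch {
    /** Mutable row, standing in for a SimpleFeature. */
    static class Row {
        String value;

        Row(String value) {
            this.value = value;
        }
    }

    private final Iterator<String> existing; // plays the role of the CSVIterator
    private final List<String> staged = new ArrayList<>(); // plays the role of the temp file
    private boolean appending = false;
    private Row current;

    RowWriterSketch(List<String> rows) {
        this.existing = rows.iterator();
    }

    boolean hasNext() {
        return !appending && existing.hasNext(); // nothing left to read once appending
    }

    Row next() {
        if (current != null) {
            write(); // the previous row was not written, so do it now
        }
        if (!appending && existing.hasNext()) {
            current = new Row(existing.next());
        } else {
            appending = true;
            current = new Row(""); // blank row for the caller to fill in
        }
        return current;
    }

    void remove() {
        current = null; // skipped when writing, which effectively deletes it
    }

    void write() {
        if (current == null) {
            return; // row was removed or already written
        }
        staged.add(current.value);
        current = null;
    }

    /** Flushes the pending row, copies any unvisited rows, and returns the staged content. */
    List<String> close() {
        if (current != null) {
            write();
        }
        while (hasNext()) {
            next();
            write();
        }
        return staged;
    }

    public static void main(String[] args) {
        RowWriterSketch writer = new RowWriterSketch(Arrays.asList("a", "b", "c"));
        writer.next().value = "A"; // modify the first row
        writer.write();
        writer.next(); // visit "b" ...
        writer.remove(); // ... and drop it
        System.out.println(writer.close());
    }
}
```

Rows that are never visited are carried over unchanged by close(), which is why a caller can stop iterating early without losing data.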

Finally, the write() method takes our current feature and uses the strategy to encode it. The encoding gives us back the feature as a CSV record, which our writer then writes out to the temp file. When the writer is closed, close() writes out any remaining features and then copies the contents of the temp file into the file our store holds in CSVFileState.

    public void write() throws IOException {
        if (this.currentFeature == null) {
            return; // current feature has been deleted
        }
        this.csvWriter.writeRecord(this.csvStrategy.encode(this.currentFeature));
        nextRow++;
        this.currentFeature = null; // indicate that it has been written
    }
    // write end

    // close start
    @Override
    public void close() throws IOException {
        if (csvWriter == null) {
            throw new IOException("Writer already closed");
        }
        if (this.currentFeature != null) {
            this.write(); // the previous one was not written, so do it now.
        }
        // Step 1: Write out remaining contents (if applicable)
        while (hasNext()) {
            next();
            write();
        }
        csvWriter.close();
        csvWriter = null;
        if (this.iterator != null) {
            this.iterator.close();
            this.iterator = null;
        }
        // Step 2: Replace file contents
        File file = this.csvFileState.getFile();

        Files.copy(temp.toPath(), file.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }
}
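
The "stage to a temp file, then replace" step in close() can be tried in isolation with nothing but java.nio.file. This is a sketch under stated assumptions: StagedRewriteSketch and rewrite() are hypothetical names, the content is plain lines rather than encoded features, and unlike the listing above it deletes the temp file afterwards.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper showing the stage-then-replace idea used by the writer's close(). */
class StagedRewriteSketch {

    /** Writes the lines to a temp file next to the target, then copies it over the target. */
    static void rewrite(Path target, List<String> lines) throws IOException {
        Path directory = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(directory, "staged", ".csv");
        try {
            Files.write(temp, lines, StandardCharsets.UTF_8);
            // The original is only touched once the new content is completely written
            Files.copy(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // clean up (an addition in this sketch)
        }
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("csv-sketch");
        Path target = directory.resolve("cities.csv");
        Files.write(target, Arrays.asList("name", "old"), StandardCharsets.UTF_8);
        rewrite(target, Arrays.asList("name", "new"));
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

Staging matters because the writer is still reading from the original file while it produces output; writing in place would overwrite rows that have not been read yet.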

CSVIterator

The CSVIterator is a helper class primarily for CSVFeatureReader. Much of the old code is now implemented here, and has the added benefit of allowing an iterator to be instantiated for use elsewhere, making the code more general than before. With the addition of the CSVFileState, the class now reads from it instead of the CSVDataStore.

import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import org.geotools.data.csv.CSVFileState;
import org.opengis.feature.simple.SimpleFeature;

public class CSVIterator implements Iterator<SimpleFeature> {

    private int idx;

    private SimpleFeature next;

    private final CsvReader csvReader;

    private final CSVStrategy csvStrategy;

    public CSVIterator(CSVFileState csvFileState, CSVStrategy csvStrategy) throws IOException {
        this.csvStrategy = csvStrategy;
        csvReader = csvFileState.openCSVReader();
        idx = 1;
        next = null;
    }

Because we’re now using strategy objects to implement functionality, the readFeature() method no longer makes any assumptions about the nature of the data; that decision is delegated to the strategy. The resulting method is shorter: it simply hands each record it reads to buildFeature(), which relies on the strategy to turn it into a feature.

    private SimpleFeature readFeature() throws IOException {
        if (csvReader.readRecord()) {
            String[] csvRecord = csvReader.getValues();
            return buildFeature(csvRecord);
        }
        return null;
    }
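
The hasNext() and next() methods of CSVIterator are not listed here. The `next` field initialised in the constructor suggests a look-ahead design, so the following sketch shows that general technique on plain text lines. It is an assumption about the approach, not a copy of the class; LookAheadLines and readRecord() are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Look-ahead iterator over lines of text; a stand-in for iterating over features. */
class LookAheadLines implements Iterator<String> {
    private final BufferedReader reader;
    private String next; // buffered record, null when nothing is buffered

    LookAheadLines(String data) {
        this.reader = new BufferedReader(new StringReader(data));
    }

    /** Reads one record, or returns null at the end of the input. */
    private String readRecord() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            next = readRecord(); // read ahead once and keep the result
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = next;
        next = null; // hand the buffered record over exactly once
        return result;
    }

    public static void main(String[] args) {
        Iterator<String> lines = new LookAheadLines("a,1\nb,2");
        while (lines.hasNext()) {
            System.out.println(lines.next());
        }
    }
}
```

Buffering the record means hasNext() can be called any number of times without skipping data, which a reader-backed iterator cannot guarantee otherwise.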

CSVFileState

The CSVFileState is a new class to assist with File manipulation in our CSVDataStore. It will hold some information about our .csv file and allow it to be opened for reading.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import org.apache.commons.io.FilenameUtils;
import org.geotools.referencing.CRS;
import org.opengis.referencing.FactoryException;
import org.opengis.referencing.crs.CoordinateReferenceSystem;

public class CSVFileState {

    private static CoordinateReferenceSystem DEFAULT_CRS() throws FactoryException {
        return CRS.decode("EPSG:4326");
    }

    private final File file;

    private final String typeName;

    private final CoordinateReferenceSystem crs;

    private final URI namespace;

    private final String dataInput;

    private volatile String[] headers = null;

    public CSVFileState(File file) {
        this(file, null, null, null);
    }

    public CSVFileState(File file, URI namespace) {
        this(file, namespace, null, null);
    }

    public CSVFileState(File file, URI namespace, String typeName, CoordinateReferenceSystem crs) {
        this.file = file;
        this.typeName = typeName;
        this.crs = crs;
        this.namespace = namespace;
        this.dataInput = null;
    }

    // used by unit tests
    public CSVFileState(String dataInput) {
        this(dataInput, null);
    }

    public CSVFileState(String dataInput, String typeName) {
        this.dataInput = dataInput;
        this.typeName = typeName;
        this.crs = null;
        this.namespace = null;
        this.file = null;
    }

    public URI getNamespace() {
        return namespace;
    }

    public File getFile() {
        return file;
    }

    public String getTypeName() {
        return typeName != null ? typeName : FilenameUtils.getBaseName(file.getPath());
    }

    public CoordinateReferenceSystem getCrs() {
        if (crs != null) {
            return crs;
        }

        try {
            return CSVFileState.DEFAULT_CRS();
        } catch (FactoryException e) {
            return null;
        }
    }

    // docs start openCSVReader

The class opens the file for reading, checks that the CSV headers can be read, and gives back a CsvReader to read the file through a stream.

    public CsvReader openCSVReader() throws IOException {
        Reader reader;
        if (file != null) {
            reader = new BufferedReader(new FileReader(file));
        } else {
            reader = new StringReader(dataInput);
        }
        CsvReader csvReader = new CsvReader(reader);
        if (!csvReader.readHeaders()) {
            throw new IOException("Error reading csv headers");
        }
        return csvReader;
    }

The readCSVHeaders() and getCSVHeaders() methods grab the headers from the file (thus, leaving just the data).

    public String[] getCSVHeaders() {
        if (headers == null) {
            synchronized (this) {
                if (headers == null) {
                    headers = readCSVHeaders();
                }
            }
        }
        return headers;
    }

    private String[] readCSVHeaders() {
        CsvReader csvReader = null;
        try {
            csvReader = openCSVReader();
            return csvReader.getHeaders();
        } catch (IOException e) {
            throw new RuntimeException("Failure reading csv headers", e);
        } finally {
            if (csvReader != null) {
                csvReader.close();
            }
        }
    }
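
getCSVHeaders() caches its result using a volatile field and a synchronized block, so the file is opened for its headers at most once. The sketch below isolates that lazy-initialisation idiom. LazyHeadersSketch is a hypothetical class, the naive split on commas is a simplification (no quoting support), and the `reads` counter exists only to make the caching visible.

```java
import java.util.Arrays;

/** Hypothetical miniature of the header cache in CSVFileState. */
class LazyHeadersSketch {
    private final String dataInput;
    private volatile String[] headers; // volatile so other threads see the published array
    int reads = 0; // demonstration only: how many times the header line was parsed

    LazyHeadersSketch(String dataInput) {
        this.dataInput = dataInput;
    }

    /** Double-checked locking: lock only while the headers are still unset. */
    String[] getHeaders() {
        String[] local = headers;
        if (local == null) {
            synchronized (this) {
                local = headers;
                if (local == null) {
                    local = readHeaders();
                    headers = local;
                }
            }
        }
        return local;
    }

    /** Naive parse: first line, split on commas. Real CSV parsing also handles quoting. */
    private String[] readHeaders() {
        reads++;
        String firstLine = dataInput.split("\\R", 2)[0];
        return firstLine.split(",");
    }

    public static void main(String[] args) {
        LazyHeadersSketch state = new LazyHeadersSketch("lat,lon,name\n45.5,-122.6,Portland");
        System.out.println(Arrays.toString(state.getHeaders()));
        state.getHeaders();
        System.out.println(state.reads);
    }
}
```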

CSVStrategy

CSVStrategy defines the API used internally by CSVDataStore when converting from CSV Records to Features (and vice versa).

public abstract class CSVStrategy {

    protected final CSVFileState csvFileState;

    public CSVStrategy(CSVFileState csvFileState) {
        this.csvFileState = csvFileState;
    }

    public CSVIterator iterator() throws IOException {
        return new CSVIterator(csvFileState, this);
    }

The name “strategy” comes from the strategy pattern - where an object (the strategy) is injected into our CSVDataStore to configure it for use. CSVDataStore will call the strategy object as needed (rather than have a bunch of switch/case statements inside each method).

Subclasses of CSVStrategy will need to implement:

  • buildFeatureType() - generate a FeatureType (from the CSV file headers - and possibly a scan of the data)
  • createSchema(SimpleFeatureType) - create a new file using the provided feature type
  • decode(String, String[]) - decode a record from the csv file
  • encode(SimpleFeature) - encode a feature as a record (to be written to the csv file)

This API is captured as an abstract class which can be subclassed for specific strategies. The strategy objects are used by the CSVDataStore to determine how certain methods will operate: by passing the strategy objects into the CSVDataStore, their implementation is used. Through this design, we can continue extending the abilities of the CSVDataStore in the future much more easily.
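
As a rough illustration of the pattern, here is a self-contained miniature in which a store delegates record handling to whichever strategy it was given. Everything in it is hypothetical and simplified: features are plain maps, RecordStrategySketch has only two of the four methods listed above, and the column names "lat" and "lon" are hard-coded.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** The store never inspects the data itself; it only calls the injected strategy. */
class StoreSketch {
    private final String[] headers;
    private final RecordStrategySketch strategy;

    StoreSketch(String[] headers, RecordStrategySketch strategy) {
        this.headers = headers;
        this.strategy = strategy;
    }

    Map<String, Object> read(String[] record) {
        return strategy.decode(headers, record);
    }

    String[] write(Map<String, Object> feature) {
        return strategy.encode(headers, feature);
    }

    public static void main(String[] args) {
        String[] headers = {"lat", "lon", "name"};
        String[] record = {"45.5", "-122.6", "Portland"};
        StoreSketch store = new StoreSketch(headers, new LatLonSketch());
        System.out.println(Arrays.toString(store.write(store.read(record))));
    }
}

/** Hypothetical, much simplified analogue of CSVStrategy. */
abstract class RecordStrategySketch {
    abstract Map<String, Object> decode(String[] headers, String[] record);

    abstract String[] encode(String[] headers, Map<String, Object> feature);
}

/** Treats every column as a plain text attribute. */
class AttributesOnlySketch extends RecordStrategySketch {
    @Override
    Map<String, Object> decode(String[] headers, String[] record) {
        Map<String, Object> feature = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            feature.put(headers[i], record[i]);
        }
        return feature;
    }

    @Override
    String[] encode(String[] headers, Map<String, Object> feature) {
        String[] record = new String[headers.length];
        for (int i = 0; i < headers.length; i++) {
            record[i] = String.valueOf(feature.get(headers[i]));
        }
        return record;
    }
}

/** Folds the "lat" and "lon" columns into one "location" attribute holding {lon, lat}. */
class LatLonSketch extends RecordStrategySketch {
    @Override
    Map<String, Object> decode(String[] headers, String[] record) {
        Map<String, Object> feature = new LinkedHashMap<>();
        double lat = 0;
        double lon = 0;
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].equals("lat")) {
                lat = Double.parseDouble(record[i]);
            } else if (headers[i].equals("lon")) {
                lon = Double.parseDouble(record[i]);
            } else {
                feature.put(headers[i], record[i]);
            }
        }
        feature.put("location", new double[] {lon, lat});
        return feature;
    }

    @Override
    String[] encode(String[] headers, Map<String, Object> feature) {
        double[] location = (double[]) feature.get("location");
        String[] record = new String[headers.length];
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].equals("lat")) {
                record[i] = String.valueOf(location[1]);
            } else if (headers[i].equals("lon")) {
                record[i] = String.valueOf(location[0]);
            } else {
                record[i] = String.valueOf(feature.get(headers[i]));
            }
        }
        return record;
    }
}
```

Supporting a new layout, such as a WKT column, then means adding one more subclass; StoreSketch does not change.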

The base class has some support methods available for use by all the strategy objects. The createBuilder() methods are helpers that set some of the common portions for the SimpleFeatureBuilder utility object, such as the type name, coordinate reference system, namespace URI, and then the column headers.

The findMostSpecificTypesFromData(CsvReader csvReader, String[] headers) method attempts to find the type of the data being read. It attempts to read it as an Integer first, and if the format is incorrect, it tries a Double next, and if the format is still incorrect, it just defaults to a String type. It scans the entire file when doing so to ensure that later on the values do not change to a different type.
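
The narrowing rule (Integer, then Double, then String) can be tried out on in-memory rows. This sketch is an approximation written for illustration: TypeGuessSketch, guessTypes() and widen() are hypothetical names, rows are passed as a list instead of being read through a CsvReader, and the handling of empty values in the real method may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-alone version of the column type guessing described above. */
class TypeGuessSketch {

    /** Scans every row; a column keeps the most specific type that fits all its values. */
    static Map<String, Class<?>> guessTypes(String[] headers, List<String[]> rows) {
        Map<String, Class<?>> result = new LinkedHashMap<>();
        for (String header : headers) {
            result.put(header, Integer.class); // start off assuming Integers for everything
        }
        for (String[] row : rows) {
            int columns = Math.min(row.length, headers.length); // ignore surplus values
            for (int i = 0; i < columns; i++) {
                result.put(headers[i], widen(result.get(headers[i]), row[i]));
            }
        }
        return result;
    }

    /** Returns the current type if the value still parses, otherwise a more general one. */
    static Class<?> widen(Class<?> type, String value) {
        if (type == Integer.class) {
            try {
                Integer.parseInt(value);
                return Integer.class;
            } catch (NumberFormatException e) {
                type = Double.class; // not an integer, try the next candidate
            }
        }
        if (type == Double.class) {
            try {
                Double.parseDouble(value);
                return Double.class;
            } catch (NumberFormatException e) {
                return String.class;
            }
        }
        return String.class; // once a column is text it stays text
    }

    public static void main(String[] args) {
        String[] headers = {"id", "lat", "name"};
        List<String[]> rows =
                Arrays.asList(new String[] {"1", "45.5", "x"}, new String[] {"2", "7", "y"});
        System.out.println(guessTypes(headers, rows));
    }
}
```

Types only ever move toward the more general end, which is why a single pass over the file is enough even when an odd value shows up in the last row.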

    /**
     * Originally in a strategy support class - giving a chance to override them to improve
     * efficiency and utilize the different strategies
     */
    public static SimpleFeatureTypeBuilder createBuilder(CSVFileState csvFileState) {
        CsvReader csvReader = null;
        Map<String, Class<?>> typesFromData = null;
        String[] headers = null;
        try {
            csvReader = csvFileState.openCSVReader();
            headers = csvReader.getHeaders();
            typesFromData = findMostSpecificTypesFromData(csvReader, headers);
        } catch (IOException e) {
            throw new RuntimeException("Failure reading csv file", e);
        } finally {
            if (csvReader != null) {
                csvReader.close();
            }
        }
        return createBuilder(csvFileState, headers, typesFromData);
    }

    public static SimpleFeatureTypeBuilder createBuilder(
            CSVFileState csvFileState, String[] headers, Map<String, Class<?>> typesFromData) {
        SimpleFeatureTypeBuilder builder = new SimpleFeatureTypeBuilder();
        builder.setName(csvFileState.getTypeName());
        builder.setCRS(csvFileState.getCrs());
        if (csvFileState.getNamespace() != null) {
            builder.setNamespaceURI(csvFileState.getNamespace());
        }
        for (String col : headers) {
            Class<?> type = typesFromData.get(col);
            builder.add(col, type);
        }
        return builder;
    }

    /**
     * Performs a full file scan attempting to guess the type of each column. Specific strategy
     * implementations will expand this functionality by overriding the buildFeatureType() method.
     */
    protected static Map<String, Class<?>> findMostSpecificTypesFromData(
            CsvReader csvReader, String[] headers) throws IOException {
        Map<String, Class<?>> result = new HashMap<String, Class<?>>();
        // start off assuming Integers for everything
        for (String header : headers) {
            result.put(header, Integer.class);
        }
        // Read through the whole file in case the type changes in later rows
        while (csvReader.readRecord()) {
            String[] record = csvReader.getValues();
            List<String> values = Arrays.asList(record);
            if (record.length >= headers.length) {
                values = values.subList(0, headers.length);
            }
            int i = 0;
            for (String value : values) {
                String header = headers[i];
                Class<?> type = result.get(header);
                // For each value in the row, ensure we can still parse it as the
                // defined type for this column; if not, make it more general
                if (type == Integer.class) {
                    try {
                        Integer.parseInt(value);
                    } catch (NumberFormatException e) {
                        try {
                            Double.parseDouble(value);
                            type = Double.class;
                        } catch (NumberFormatException ex) {
                            type = String.class;
                        }
                    }
                } else if (type == Double.class) {
                    try {
                        Double.parseDouble(value);
                    } catch (NumberFormatException e) {
                        type = String.class;
                    }
                }
                result.put(header, type);
                i++;
            }
        }
        return result;
    }
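
To try the widening rules without a CSV file, the following sketch applies the same Integer, then Double, then String fallback to rows held in memory. ColumnTypeSketch and its widen() and guess() methods are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it mirrors the widening rules described above without a CsvReader.
class ColumnTypeSketch {

    /** Returns the current type if the value still parses as it, else the next wider type that fits. */
    static Class<?> widen(Class<?> current, String value) {
        if (current == Integer.class) {
            try {
                Integer.parseInt(value);
                return Integer.class;
            } catch (NumberFormatException e) {
                current = Double.class; // not an integer, try the next wider type
            }
        }
        if (current == Double.class) {
            try {
                Double.parseDouble(value);
                return Double.class;
            } catch (NumberFormatException e) {
                return String.class;
            }
        }
        return String.class;
    }

    /** Starts every column as Integer and widens it while scanning all rows. */
    static Map<String, Class<?>> guess(String[] headers, List<String[]> rows) {
        Map<String, Class<?>> result = new LinkedHashMap<>();
        for (String header : headers) {
            result.put(header, Integer.class);
        }
        for (String[] row : rows) {
            // Ignore extra values beyond the header row, tolerate short rows
            int columns = Math.min(row.length, headers.length);
            for (int i = 0; i < columns; i++) {
                result.put(headers[i], widen(result.get(headers[i]), row[i]));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows =
                Arrays.asList(new String[] {"1", "4.5", "a"}, new String[] {"2", "7", "b"});
        System.out.println(guess(new String[] {"id", "lat", "name"}, rows));
    }
}
```

One consequence worth knowing: an empty cell fails both numeric parses, so under these rules a single blank value is enough to turn a numeric column into a String column.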

CSVAttributesOnlyStrategy

The CSVAttributesOnlyStrategy is the simplest implementation. It reads the file directly and uses the values as attributes of the feature. The feature type is built with the createBuilder() helper of the CSVStrategy base class. Each header in the .csv file becomes an attribute, and each row provides the values of those attributes for one feature. The csvRecord parameter of decode() contains one line of data read from the file, and each String is mapped to its attribute. The SimpleFeatureBuilder utility class uses this data to build the feature.

import com.csvreader.CsvWriter;
import com.vividsolutions.jts.geom.Geometry;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.geotools.data.csv.CSVFileState;
import org.geotools.feature.simple.SimpleFeatureBuilder;
import org.geotools.feature.simple.SimpleFeatureTypeBuilder;
import org.opengis.feature.Property;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;
import org.opengis.feature.type.AttributeDescriptor;
import org.opengis.feature.type.GeometryDescriptor;

public class CSVAttributesOnlyStrategy extends CSVStrategy {

    public CSVAttributesOnlyStrategy(CSVFileState csvFileState) {
        super(csvFileState);
    }

    @Override
    protected SimpleFeatureType buildFeatureType() {
        SimpleFeatureTypeBuilder builder = createBuilder(csvFileState);
        return builder.buildFeatureType();
    }

    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        List<String> header = new ArrayList<String>();
        this.featureType = featureType;
        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            if (descriptor instanceof GeometryDescriptor) continue;
            header.add(descriptor.getLocalName());
        }

        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }

    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<String>();
        for (Property property : feature.getProperties()) {
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (Geometry.class.isAssignableFrom(value.getClass())) {
                // skip geometries
            } else {
                String txt = value.toString();
                csvRecord.add(txt);
            }
        }
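        // A zero-length array lets toArray() allocate the result and avoids a
        // NegativeArraySizeException when the record is empty
        return csvRecord.toArray(new String[0]);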
    }

    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        String[] headers;
        headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                // geotools converters take care of converting the String for us
                String value = csvRecord[i].trim();
                builder.set(header, value);
            } else {
                // the record is shorter than the header row: leave the attribute empty
                builder.set(header, null);
            }
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
}
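
The way decode() lines up a record with the headers can be tried in isolation. The sketch below uses a plain Map in place of SimpleFeatureBuilder; RecordPaddingSketch and toAttributes() are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; a Map stands in for the SimpleFeatureBuilder used by the real strategy.
class RecordPaddingSketch {

    /** Pairs each header with the trimmed value in the same position, or null if the record is short. */
    static Map<String, String> toAttributes(String[] headers, String[] csvRecord) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 0; i < headers.length; i++) {
            String value = i < csvRecord.length ? csvRecord[i].trim() : null;
            attributes.put(headers[i], value);
        }
        return attributes;
    }

    public static void main(String[] args) {
        String[] headers = {"id", "name", "note"};
        String[] shortRecord = {" 7 ", "Ada"};
        System.out.println(toAttributes(headers, shortRecord));
    }
}
```

Values beyond the last header are ignored and missing trailing values become null, so a ragged row still yields one entry per attribute.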

CSVLatLonStrategy

The CSVLatLonStrategy adds one step: it replaces the Latitude and Longitude fields with a Point geometry. We search through the headers for a match for both Latitude and Longitude, and if both are found and numeric, we remove those attributes and replace them with the Point geometry. The user can specify the strings to use when searching for the Lat and Lon columns (for example, “LAT” and “LON”). Otherwise, the class tries a set of common lat/lon spellings. The user can also choose a name for the geometry column; otherwise it defaults to “location”. Using this information, it builds the feature type.

public class CSVLatLonStrategy extends CSVStrategy {

    private String latField;

    private String lngField;

    private String pointField;

    public CSVLatLonStrategy(CSVFileState csvFileState) {
        this(csvFileState, null, null);
    }

    public CSVLatLonStrategy(CSVFileState csvFileState, String latField, String lngField) {
        this(csvFileState, latField, lngField, "location");
    }

    public CSVLatLonStrategy(
            CSVFileState csvFileState, String latField, String lngField, String pointField) {
        super(csvFileState);
        this.latField = latField;
        this.lngField = lngField;
        this.pointField = pointField;
    }

    @Override
    protected SimpleFeatureType buildFeatureType() {
        String[] headers;
        Map<String, Class<?>> typesFromData;
        CsvReader csvReader = null;
        try {
            csvReader = csvFileState.openCSVReader();
            headers = csvReader.getHeaders();
            typesFromData = findMostSpecificTypesFromData(csvReader, headers);
        } catch (IOException e) {
            throw new RuntimeException(e);
        } finally {
            if (csvReader != null) {
                csvReader.close();
            }
        }
        SimpleFeatureTypeBuilder builder = createBuilder(csvFileState, headers, typesFromData);

        // If the lat/lon fields were not specified, figure out their spelling now
        if (latField == null || lngField == null) {
            for (String col : headers) {
                if (isLatitude(col)) {
                    latField = col;
                } else if (isLongitude(col)) {
                    lngField = col;
                }
            }
        }

        // For LatLon strategy, we need to change the Lat and Lon columns
        // to be recognized as a Point rather than two numbers, if the
        // values in the respective columns are all accurate (numeric)
        Class<?> latClass = typesFromData.get(latField);
        Class<?> lngClass = typesFromData.get(lngField);
        if (isNumeric(latClass) && isNumeric(lngClass)) {
            List<String> csvHeaders = Arrays.asList(headers);
            int index = csvHeaders.indexOf(latField);
            AttributeTypeBuilder builder2 = new AttributeTypeBuilder();
            builder2.setCRS(DefaultGeographicCRS.WGS84);
            builder2.binding(Point.class);
            AttributeDescriptor descriptor = builder2.buildDescriptor(pointField);
            builder.add(index, descriptor);

            builder.remove(latField);
            builder.remove(lngField);
        }
        return builder.buildFeatureType();
    }

    private boolean isLatitude(String s) {
        return "latitude".equalsIgnoreCase(s) || "lat".equalsIgnoreCase(s);
    }

    private boolean isLongitude(String s) {
        return "lon".equalsIgnoreCase(s)
                || "lng".equalsIgnoreCase(s)
                || "long".equalsIgnoreCase(s)
                || "longitude".equalsIgnoreCase(s);
    }
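
The header detection can be exercised on its own. The following sketch copies the two spelling checks into a standalone class; LatLonHeaderSketch and detect() are hypothetical names for this illustration only.

```java
// Hypothetical helper that repeats the spelling checks outside of the strategy class.
class LatLonHeaderSketch {

    static boolean isLatitude(String s) {
        return "latitude".equalsIgnoreCase(s) || "lat".equalsIgnoreCase(s);
    }

    static boolean isLongitude(String s) {
        return "lon".equalsIgnoreCase(s)
                || "lng".equalsIgnoreCase(s)
                || "long".equalsIgnoreCase(s)
                || "longitude".equalsIgnoreCase(s);
    }

    /** Returns {latField, lngField}; an entry stays null when no header matched. */
    static String[] detect(String[] headers) {
        String latField = null;
        String lngField = null;
        for (String col : headers) {
            if (isLatitude(col)) {
                latField = col;
            } else if (isLongitude(col)) {
                lngField = col;
            }
        }
        return new String[] {latField, lngField};
    }

    public static void main(String[] args) {
        String[] found = detect(new String[] {"name", "LAT", "Long"});
        System.out.println(found[0] + ", " + found[1]);
    }
}
```

The match is case-insensitive but otherwise exact, so headers such as "x" and "y" are not recognized; for files like that the field names would need to be passed to the constructor.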

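When encoding the feature, the Point is split back into two values: the Y value (latitude) and the X value (longitude). They are written in the same order as the latitude and longitude columns appear in the file, with latitude first when no order can be determined, which matches the latitude/longitude axis order defined for WGS84 (EPSG:4326). Otherwise, it works the same as the attributes-only strategy.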

    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<String>();
        String[] headers = csvFileState.getCSVHeaders();
        int latIndex = 0;
        int lngIndex = 0;
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].equalsIgnoreCase(latField)) {
                latIndex = i;
            }
            if (headers[i].equalsIgnoreCase(lngField)) {
                lngIndex = i;
            }
        }
        for (Property property : feature.getProperties()) {
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (value instanceof Point) {
                // getY() is the latitude and getX() the longitude; write them in
                // the same order as the lat/lon columns appear in the file
                Point point = (Point) value;
                if (lngIndex < latIndex) {
                    csvRecord.add(Double.toString(point.getX()));
                    csvRecord.add(Double.toString(point.getY()));
                } else {
                    csvRecord.add(Double.toString(point.getY()));
                    csvRecord.add(Double.toString(point.getX()));
                }

            } else {
                String txt = value.toString();
                csvRecord.add(txt);
            }
        }
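        // A zero-length array lets toArray() allocate the result and avoids a
        // NegativeArraySizeException when the record is empty
        return csvRecord.toArray(new String[0]);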
    }
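
The column-ordering rule is easy to get backwards, so here is a small sketch of it on its own. It assumes the point stores longitude in X and latitude in Y; PointOrderSketch and ordered() are hypothetical names.

```java
// Hypothetical helper; assumes X holds the longitude and Y the latitude.
class PointOrderSketch {

    /** Returns the two values in the order of the lat/lon columns in the file. */
    static String[] ordered(double x, double y, int latIndex, int lngIndex) {
        String lng = Double.toString(x);
        String lat = Double.toString(y);
        return lngIndex < latIndex ? new String[] {lng, lat} : new String[] {lat, lng};
    }

    public static void main(String[] args) {
        // File laid out as "lat,lon": latitude is written first
        System.out.println(String.join(",", ordered(-123.1, 49.3, 0, 1)));
        // File laid out as "lon,lat": longitude is written first
        System.out.println(String.join(",", ordered(-123.1, 49.3, 1, 0)));
    }
}
```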

When decoding a csvRecord into a feature, we look for the latField and lngField columns and store their values. At the end, if we have successfully read both a latitude and a longitude, we combine them into a Point and set it on the feature.

    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        GeometryDescriptor geometryDescriptor = featureType.getGeometryDescriptor();
        GeometryFactory geometryFactory = new GeometryFactory();
        Double lat = null, lng = null;
        String[] headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                String value = csvRecord[i].trim();
                if (geometryDescriptor != null && header.equals(latField)) {
                    lat = Double.valueOf(value);
                } else if (geometryDescriptor != null && header.equals(lngField)) {
                    lng = Double.valueOf(value);
                } else {
                    builder.set(header, value);
                }
            } else {
                builder.set(header, null);
            }
        }
        if (geometryDescriptor != null && lat != null && lng != null) {
            Coordinate coordinate;
            // When the first axis of the CRS points north, latitude is the first ordinate
            if (geometryDescriptor
                    .getCoordinateReferenceSystem()
                    .getCoordinateSystem()
                    .getAxis(0)
                    .getDirection()
                    .equals(AxisDirection.NORTH)) {
                coordinate = new Coordinate(lat, lng);
            } else {
                coordinate = new Coordinate(lng, lat);
            }

            Point point = geometryFactory.createPoint(coordinate);
            builder.set(geometryDescriptor.getLocalName(), point);
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
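
The decoding steps can be sketched without the GeoTools types. In the example below a two-element array stands in for the Coordinate and a small enum stands in for the direction of the first CRS axis; LatLonDecodeSketch, FirstAxis and toOrdinates() are hypothetical names.

```java
// Hypothetical helper; a double[] stands in for Coordinate, FirstAxis for the CRS axis direction.
class LatLonDecodeSketch {

    enum FirstAxis {
        NORTH,
        EAST
    }

    /** Returns the ordinates in CRS axis order, or null when either value is missing. */
    static double[] toOrdinates(
            String[] headers, String[] record, String latField, String lngField, FirstAxis first) {
        Double lat = null;
        Double lng = null;
        for (int i = 0; i < headers.length && i < record.length; i++) {
            String value = record[i].trim();
            if (headers[i].equals(latField)) {
                lat = Double.valueOf(value);
            } else if (headers[i].equals(lngField)) {
                lng = Double.valueOf(value);
            }
        }
        if (lat == null || lng == null) {
            return null; // not enough information to build a point
        }
        // Latitude goes first only when the first axis points north
        return first == FirstAxis.NORTH ? new double[] {lat, lng} : new double[] {lng, lat};
    }

    public static void main(String[] args) {
        String[] headers = {"lat", "lon", "name"};
        String[] record = {"49.3", " -123.1", "Vancouver"};
        double[] xy = toOrdinates(headers, record, "lat", "lon", FirstAxis.EAST);
        System.out.println(xy[0] + " " + xy[1]);
    }
}
```

Double.valueOf() throws a NumberFormatException for non-numeric text. The sketch does not guard against that; the strategy only creates its geometry column when both columns were inferred to be numeric.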

For our createSchema() method, we look up the geometry column - which must be a Point with WGS84 as its CRS - and, if it is suitable, add our specified latField and lngField to the header in the axis order of that CRS. If it is not, we throw an IOException. We then loop over the attribute descriptors: the GeometryDescriptor is skipped because the lat/lon columns already represent it, and every other attribute keeps the name it was given. Finally, the header is written using the CsvWriter.

    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        List<String> header = new ArrayList<String>();

        GeometryDescriptor geometryDescriptor = featureType.getGeometryDescriptor();
        // Look up the CRS only when there is a geometry column, avoiding a NullPointerException
        CoordinateReferenceSystem crs =
                geometryDescriptor != null
                        ? geometryDescriptor.getCoordinateReferenceSystem()
                        : null;
        if (geometryDescriptor != null
                && CRS.equalsIgnoreMetadata(DefaultGeographicCRS.WGS84, crs)
                && geometryDescriptor.getType().getBinding().isAssignableFrom(Point.class)) {
            if (crs.getCoordinateSystem().getAxis(0).getDirection().equals(AxisDirection.NORTH)) {
                header.add(this.latField);
                header.add(this.lngField);
            } else {
                header.add(this.lngField);
                header.add(this.latField);
            }
        } else {
            throw new IOException(
                    "Unable to use "
                            + this.latField
                            + "/"
                            + this.lngField
                            + " to represent "
                            + geometryDescriptor);
        }
        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            if (descriptor instanceof GeometryDescriptor) continue;
            header.add(descriptor.getLocalName());
        }
        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }

CSVSpecifiedWKTStrategy

CSVSpecifiedWKTStrategy is the strategy used for files that store geometry in Well-Known Text (WKT) format. As with the field names that can be given to CSVLatLonStrategy, the name of the WKT column must be passed to the strategy so it knows which column to parse. If that column is found, the strategy binds the Geometry class to it in the feature type.

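import com.csvreader.CsvWriter;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.io.ParseException;
import com.vividsolutions.jts.io.WKTReader;
import com.vividsolutions.jts.io.WKTWriter;
import java.io.FileWriter;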
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.geotools.data.csv.CSVFileState;
import org.geotools.feature.AttributeTypeBuilder;
import org.geotools.feature.simple.SimpleFeatureBuilder;
import org.geotools.feature.simple.SimpleFeatureTypeBuilder;
import org.geotools.referencing.crs.DefaultGeographicCRS;
import org.geotools.util.Converters;
import org.opengis.feature.Property;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;
import org.opengis.feature.type.AttributeDescriptor;
import org.opengis.feature.type.GeometryDescriptor;

public class CSVSpecifiedWKTStrategy extends CSVStrategy {

    private final String wktField;

    public CSVSpecifiedWKTStrategy(CSVFileState csvFileState, String wktField) {
        super(csvFileState);
        this.wktField = wktField;
    }

    // docs start buildFeatureType

To build the feature type with this strategy, the only thing that needs to change is the specified WKT field. Instead of reading this data as an Integer, Double or String (as the base CSVStrategy class’s createBuilder() method does), we want to use a Geometry class to store the information in the WKT field’s column. To do this, we create an AttributeTypeBuilder, set our CRS to WGS84 and the binding to Geometry.class. We get an AttributeDescriptor from this builder, supplying the specified wktField as its name. Then we set this AttributeDescriptor on the featureBuilder, which overwrites the existing descriptor of that name with the new information.

    @Override
    protected SimpleFeatureType buildFeatureType() {
        SimpleFeatureTypeBuilder featureBuilder = createBuilder(csvFileState);
        // For WKT strategy, we need to make sure the wktField is recognized as a Geometry
        AttributeDescriptor descriptor = featureBuilder.get(wktField);
        if (descriptor != null) {
            AttributeTypeBuilder attributeBuilder = new AttributeTypeBuilder();
            attributeBuilder.init(descriptor);
            attributeBuilder.setCRS(DefaultGeographicCRS.WGS84);
            attributeBuilder.binding(Geometry.class);

            AttributeDescriptor modified = attributeBuilder.buildDescriptor(wktField);
            featureBuilder.set(modified);
        }
        return featureBuilder.buildFeatureType();
    }
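
The effect of this override can be pictured as replacing one entry in an ordered map of column bindings. The sketch below is only an analogy: BindingOverrideSketch, withGeometryColumn() and the Shape marker class are hypothetical, and a LinkedHashMap stands in for the SimpleFeatureTypeBuilder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical analogy; a LinkedHashMap stands in for the feature type builder.
class BindingOverrideSketch {

    /** Placeholder for the Geometry class, so the sketch needs no extra library. */
    static class Shape {}

    /** Returns a copy in which the WKT column, if present, is bound to Shape. */
    static Map<String, Class<?>> withGeometryColumn(
            Map<String, Class<?>> inferred, String wktField) {
        Map<String, Class<?>> result = new LinkedHashMap<>(inferred);
        if (result.containsKey(wktField)) {
            // Replacing an existing key keeps its position in a LinkedHashMap
            result.put(wktField, Shape.class);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> inferred = new LinkedHashMap<>();
        inferred.put("id", Integer.class);
        inferred.put("wkt", String.class);
        inferred.put("name", String.class);
        System.out.println(withGeometryColumn(inferred, "wkt").keySet());
    }
}
```

As in buildFeatureType(), a column name that does not exist leaves the inferred types untouched.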

When creating the schema, the only special case is a GeometryDescriptor, which we know corresponds to our wktField. Every other column keeps the name it was given.

    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        List<String> header = new ArrayList<String>();

        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            if (descriptor instanceof GeometryDescriptor) {
                header.add(wktField);
            } else {
                header.add(descriptor.getLocalName());
            }
        }
        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }

When encoding a feature, we simply look for the wktField described by the strategy. If found, we use a WKTWriter to write out the Geometry as WKT text, which is then added to our csvRecord. Otherwise, the value is passed to the Converters.convert() utility method, which writes the value out as a String.

    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<String>();
        for (Property property : feature.getProperties()) {
            String name = property.getName().getLocalPart();
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (name.equals(wktField)) {
                WKTWriter wkt = new WKTWriter();
                String txt = wkt.write((Geometry) value);
                csvRecord.add(txt);
            } else {
                String txt = Converters.convert(value, String.class);
                csvRecord.add(txt);
            }
        }
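        // A zero-length array lets toArray() allocate the result and avoids a
        // NegativeArraySizeException when the record is empty
        return csvRecord.toArray(new String[0]);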
    }

When decoding a csvRecord, we check whether we are in the WKT column (the current header value is the specified wktField) and whether our featureType has a GeometryDescriptor. If both are true, we create a WKTReader to read the value as a Geometry so that we can build our feature with it. If parsing fails, the ParseException is caught and the attribute is set to null.

    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        GeometryDescriptor geometryDescriptor = featureType.getGeometryDescriptor();
        String[] headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                String value = csvRecord[i].trim();
                if (geometryDescriptor != null && header.equals(wktField)) {
                    WKTReader wktReader = new WKTReader();
                    Geometry geometry;
                    try {
                        geometry = wktReader.read(value);
                    } catch (ParseException e) {
                        // policy decision here that just nulls out unparseable geometry
                        geometry = null;
                    }
                    builder.set(wktField, geometry);
                } else {
                    builder.set(header, value);
                }
            } else {
                builder.set(header, null);
            }
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
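
The "null out unparseable geometry" policy can be tried without JTS on the classpath. The sketch below handles only simple 2D points written in plain decimal notation, using a regular expression in place of WKTReader and string concatenation in place of WKTWriter; WktPointSketch, write() and parseOrNull() are hypothetical names and not a substitute for the JTS classes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for WKTReader/WKTWriter, limited to 2D POINT text in plain decimals.
class WktPointSketch {

    private static final Pattern POINT =
            Pattern.compile(
                    "\\s*POINT\\s*\\(\\s*(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)\\s*\\)\\s*",
                    Pattern.CASE_INSENSITIVE);

    /** Writes the point as WKT-style text. */
    static String write(double x, double y) {
        return "POINT (" + x + " " + y + ")";
    }

    /** Returns {x, y}, or null when the text is not a simple 2D POINT. */
    static double[] parseOrNull(String wkt) {
        Matcher matcher = POINT.matcher(wkt);
        if (!matcher.matches()) {
            // Same policy as decode(): unparseable geometry becomes null
            return null;
        }
        return new double[] {
            Double.parseDouble(matcher.group(1)), Double.parseDouble(matcher.group(2))
        };
    }

    public static void main(String[] args) {
        double[] xy = parseOrNull(write(-123.1, 49.3));
        System.out.println(xy[0] + " " + xy[1]);
        System.out.println(parseOrNull("POINT (1)") == null);
    }
}
```

Returning null keeps one bad row from aborting the whole read, at the cost of silently producing features without geometry; whether that trade-off is acceptable depends on the data being loaded.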