Strategy Patterns
Since this tutorial was written, the result has been broken out into a distinct gt-csv module. The work has also been forked into service as part of the GeoServer importer module. GeoServer's fork originally added a nifty CSVStrategy (with implementations for CSVAttributesOnlyStrategy, CSVLatLonStrategy, CSVSpecifiedLatLngStrategy and CSVSpecifiedWKTStrategy). In this section we'll discuss the new implementation, which cleans up and improves on the strategy pattern they implemented, with both read and write functionality. If you want to follow along with the code, it is available as a plugin.
CSVDataStore
CSVDataStore now uses a CSVStrategy and a CSVFileState. The CSVFileState holds information about the file we are reading from. CSVStrategy is an abstract class for the strategy objects CSVDataStore can use.
import org.geotools.api.filter.Filter; import org.geotools.data.store.ContentDataStore; import org.geotools.data.store.ContentEntry; import org.geotools.data.store.ContentFeatureSource; import org.geotools.feature.NameImpl; import org.geotools.tutorial.csv3.parse.CSVStrategy; public class CSVDataStore extends ContentDataStore implements FileDataStore { private final CSVStrategy csvStrategy; private final CSVFileState csvFileState; public CSVDataStore(CSVFileState csvFileState, CSVStrategy csvStrategy) { this.csvFileState = csvFileState; this.csvStrategy = csvStrategy; } // docs start getTypeName
Because CSVFileState does the work for us, the getTypeName() method that creates the Name is much simpler.
public Name getTypeName() {
    if (namespaceURI != null) {
        return new NameImpl(namespaceURI, csvFileState.getTypeName());
    } else {
        return new NameImpl(csvFileState.getTypeName());
    }
}
CSVDataStore now implements the FileDataStore interface, which standardizes the operations performed by file-based data stores. As such, it must override the interface's methods. Note the use of the CSVStrategy to determine the schema: depending on the strategy defined, the schema for this store will differ. The implementation of createFeatureSource() checks that the file is writable before allowing features to be written. If it is, it returns a CSVFeatureStore instead of a CSVFeatureSource; the feature store can be written to as well as read from.
@Override protected List<Name> createTypeNames() throws IOException { return Collections.singletonList(getTypeName()); } @Override protected ContentFeatureSource createFeatureSource(ContentEntry entry) throws IOException { if (csvFileState.getFile().canWrite()) { return new CSVFeatureStore(csvStrategy, csvFileState, entry, Query.ALL); } else { return new CSVFeatureSource(entry, Query.ALL); } } @Override public SimpleFeatureType getSchema() throws IOException { return this.csvStrategy.getFeatureType(); } @Override public void updateSchema(SimpleFeatureType featureType) throws IOException { throw new UnsupportedOperationException(); } @Override public SimpleFeatureSource getFeatureSource() throws IOException { return new CSVFeatureSource(this); } @Override public FeatureReader<SimpleFeatureType, SimpleFeature> getFeatureReader() throws IOException { return new CSVFeatureSource(this).getReader(); } @Override public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriter(Filter filter, Transaction transaction) throws IOException { return super.getFeatureWriter(this.csvFileState.getTypeName(), filter, transaction); } @Override public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriter(Transaction transaction) throws IOException { return super.getFeatureWriter(this.csvFileState.getTypeName(), transaction); } @Override public FeatureWriter<SimpleFeatureType, SimpleFeature> getFeatureWriterAppend(Transaction transaction) throws IOException { return super.getFeatureWriterAppend(this.csvFileState.getTypeName(), transaction); } public CSVStrategy getCSVStrategy() { return csvStrategy; } @Override public void createSchema(SimpleFeatureType featureType) throws IOException { this.csvStrategy.createSchema(featureType); }
CSVDataStoreFactory
The new architecture with the added strategy objects expands the capabilities of CSVDataStoreFactory. It now contains a few more Param fields.
Much of the class's structure is now more compartmentalized. The metadata is mostly the same, with some values now held in class fields rather than literals.
public class CSVDataStoreFactory implements FileDataStoreFactorySpi { /** GUESS_STRATEGY */ public static final String GUESS_STRATEGY = "guess"; /** ATTRIBUTES_ONLY_STRATEGY */ public static final String ATTRIBUTES_ONLY_STRATEGY = "AttributesOnly"; /** SPECIFC_STRATEGY */ public static final String SPECIFC_STRATEGY = "specify"; /** WKT_STRATEGY */ public static final String WKT_STRATEGY = "wkt"; private static final String FILE_TYPE = "csv"; public static final String[] EXTENSIONS = {"." + FILE_TYPE}; public static final Param FILE_PARAM = new Param("file", File.class, FILE_TYPE + " file", false); public static final Param URL_PARAM = new Param("url", URL.class, FILE_TYPE + " file", false); public static final Param NAMESPACEP = new Param("namespace", URI.class, "uri to the namespace", false, null, new KVP(Param.LEVEL, "advanced")); public static final Param STRATEGYP = new Param("strategy", String.class, "strategy", false); public static final Param LATFIELDP = new Param("latField", String.class, "Latitude field. Assumes a CSVSpecifiedLatLngStrategy", false); public static final Param LnGFIELDP = new Param("lngField", String.class, "Longitude field. Assumes a CSVSpecifiedLatLngStrategy", false); public static final Param WKTP = new Param("wktField", String.class, "WKT field. Assumes a CSVSpecifiedWKTStrategy", false); public static final Param QUOTEALL = new Param( "quoteAll", Boolean.class, "Should all fields be quoted (true) or just ones that need it (false)", false, Boolean.FALSE, new KVP(Param.LEVEL, "advanced")); public static final Param QUOTECHAR = new Param( "quoteChar", Character.class, "Character to be used to quote attributes", false, '"', new KVP(Param.LEVEL, "advanced")); public static final Param SEPERATORCHAR = new Param( "seperator", Character.class, "Character to be used to seperate records", false, ',', new KVP(Param.LEVEL, "advanced")); public static final Param[] parametersInfo = {FILE_PARAM, NAMESPACEP, STRATEGYP, LATFIELDP, LnGFIELDP, WKTP};
The method isAvailable() just attempts to read the class, and if it succeeds, returns true.
@Override
@SuppressWarnings("ReturnValueIgnored")
public boolean isAvailable() {
    try {
        CSVDataStore.class.getName();
    } catch (Exception e) {
        return false;
    }
    return true;
}
The canProcess(Map<String, ?> params) method was made more tolerant, now accepting URL and File parameters through the fileFromParams(Map<String, ?> params) method. It will try File first, then URL before giving up. The canProcess(URL url) overload shown here converts the URL to a file and checks only its extension.
@Override
public boolean canProcess(URL url) {
    return canProcessExtension(URLs.urlToFile(url).toString());
}
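The "File first, then URL" lookup described above could be sketched roughly as follows. This is a minimal illustration using only the JDK, not the tutorial's actual fileFromParams(): the class name ParamFiles is hypothetical, the keys "file" and "url" are taken from FILE_PARAM and URL_PARAM, and converting the URL through java.net.URI is an assumption (the tutorial code uses URLs.urlToFile).

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper illustrating a "File first, then URL" parameter lookup. */
final class ParamFiles {

    private ParamFiles() {}

    /** Returns the file named by the "file" or "url" entry, or null if neither is usable. */
    static File fileFromParams(Map<String, ?> params) {
        Object file = params.get("file");
        if (file instanceof File) {
            return (File) file; // a File parameter wins
        }
        Object url = params.get("url");
        if (url instanceof URL && "file".equals(((URL) url).getProtocol())) {
            try {
                return new File(((URL) url).toURI());
            } catch (URISyntaxException | IllegalArgumentException e) {
                return null; // the URL cannot be mapped to a local file
            }
        }
        return null; // give up: neither parameter was present
    }

    /** Accepts the parameters only when a file with a .csv extension can be found. */
    static boolean canProcess(Map<String, ?> params) {
        File file = fileFromParams(params);
        return file != null && file.getName().toLowerCase(Locale.ROOT).endsWith(".csv");
    }
}
```

Returning null rather than throwing keeps canProcess() a cheap yes/no check, which suits a factory that is probed with arbitrary parameter maps.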
Finally, the different strategies are implemented in the createDataStoreFromFile() method. The method is overloaded to make some parameters optional, which the class will then fill in for us.
public FileDataStore createDataStoreFromFile(File file) throws IOException { return createDataStoreFromFile(file, null); } public FileDataStore createDataStoreFromFile(File file, URI namespace) throws IOException { if (file == null) { throw new IllegalArgumentException("Cannot create store from null file"); } else if (!file.exists()) { throw new IllegalArgumentException("Cannot create store with file that does not exist"); } Map<String, Serializable> noParams = Collections.emptyMap(); return createDataStoreFromFile(file, namespace, noParams); }
CSVFeatureReader
The CSVFeatureReader now delegates much of its functionality to a new class called CSVIterator, as well as to the CSVStrategy. The resulting code is short: every method simply forwards to one of the two.
import org.geotools.tutorial.csv3.parse.CSVIterator; import org.geotools.tutorial.csv3.parse.CSVStrategy; public class CSVFeatureReader implements FeatureReader<SimpleFeatureType, SimpleFeature> { private SimpleFeatureType featureType; private CSVIterator iterator; public CSVFeatureReader(CSVStrategy csvStrategy) throws IOException { this(csvStrategy, Query.ALL); } public CSVFeatureReader(CSVStrategy csvStrategy, Query query) throws IOException { this.featureType = csvStrategy.getFeatureType(); this.iterator = csvStrategy.iterator(); } @Override public SimpleFeatureType getFeatureType() { return featureType; } @Override public void close() throws IOException { iterator.close(); } @Override public SimpleFeature next() throws IOException, IllegalArgumentException, NoSuchElementException { return iterator.next(); } @Override public boolean hasNext() throws IOException { return iterator.hasNext(); } }
CSVFeatureSource
CSVFeatureSource retains the same basic structure, but the code is assisted by the new classes. It now overloads the constructor:
import org.geotools.data.store.ContentEntry; import org.geotools.data.store.ContentFeatureSource; import org.geotools.geometry.jts.ReferencedEnvelope; public class CSVFeatureSource extends ContentFeatureSource { public CSVFeatureSource(CSVDataStore datastore) { this(datastore, Query.ALL); } public CSVFeatureSource(CSVDataStore datastore, Query query) { this(new ContentEntry(datastore, datastore.getTypeName()), query); } public CSVFeatureSource(ContentEntry entry) { this(entry, Query.ALL); } public CSVFeatureSource(ContentEntry entry, Query query) { super(entry, query); }
The getBoundsInternal(Query query) method is now implemented using the methods provided by ContentFeatureSource. A new ReferencedEnvelope is created to store the bounds for this feature source. Its CRS is taken from the feature type (getSchema()) via getCoordinateReferenceSystem(). The FeatureReader is now created from the Query and the CSVStrategy: the getReader() method calls getReaderInternal(Query query), which shows how it is created. Finally, the reader cycles through the features and each feature's bounds are included in the envelope, which yields the bounds of the entire datastore.
protected ReferencedEnvelope getBoundsInternal(Query query) throws IOException {
    ReferencedEnvelope bounds = new ReferencedEnvelope(getSchema().getCoordinateReferenceSystem());
    try (FeatureReader<SimpleFeatureType, SimpleFeature> featureReader = getReader(query)) {
        while (featureReader.hasNext()) {
            SimpleFeature feature = featureReader.next();
            bounds.include(feature.getBounds());
        }
    }
    return bounds;
}
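To make the accumulation step concrete, here is a small self-contained sketch of an envelope that grows as points are included. SimpleBounds is a hypothetical stand-in written for this illustration; it is not ReferencedEnvelope and ignores the CRS entirely.

```java
import java.util.List;

/** Hypothetical stand-in for an envelope that grows to cover every point it is given. */
final class SimpleBounds {
    // Start "inside out" so that the first included point defines the bounds.
    private double minX = Double.POSITIVE_INFINITY;
    private double minY = Double.POSITIVE_INFINITY;
    private double maxX = Double.NEGATIVE_INFINITY;
    private double maxY = Double.NEGATIVE_INFINITY;

    /** Expands the bounds, if needed, so that they contain the point (x, y). */
    void include(double x, double y) {
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
    }

    /** True until at least one point has been included. */
    boolean isEmpty() {
        return minX > maxX;
    }

    double getMinX() { return minX; }
    double getMinY() { return minY; }
    double getMaxX() { return maxX; }
    double getMaxY() { return maxY; }

    /** Cycles through all points, including each one, as getBoundsInternal does with features. */
    static SimpleBounds of(List<double[]> points) {
        SimpleBounds bounds = new SimpleBounds();
        for (double[] point : points) {
            bounds.include(point[0], point[1]);
        }
        return bounds;
    }
}
```

Because every row has to be visited, computing bounds this way costs a full pass over the file.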
The getReaderInternal(Query query) method now uses the strategy of the CSVDataStore rather than its state, reflecting the changes to the CSVFeatureReader design.
protected FeatureReader<SimpleFeatureType, SimpleFeature> getReaderInternal(Query query) throws IOException {
    CSVDataStore dataStore = getDataStore();
    return new CSVFeatureReader(dataStore.getCSVStrategy(), query);
}
The getCountInternal(Query query) method uses the same idea as getBoundsInternal(Query query): it now uses the Query and CSVStrategy to obtain a FeatureReader, then simply counts the features it returns.
protected int getCountInternal(Query query) throws IOException {
    FeatureReader<SimpleFeatureType, SimpleFeature> featureReader = getReaderInternal(query);
    int n = 0;
    try {
        for (n = 0; featureReader.hasNext(); n++) {
            featureReader.next();
        }
    } finally {
        featureReader.close();
    }
    return n;
}
The buildFeatureType() method is now very simple, using getSchema() to grab the feature type of the datastore.
protected SimpleFeatureType buildFeatureType() throws IOException {
    return getDataStore().getSchema();
}
CSVFeatureStore
CSVFeatureStore essentially acts as a read/write version of CSVFeatureSource. Where CSVFeatureSource is only readable, CSVFeatureStore adds the ability to write through the use of a CSVFeatureWriter. The code is updated to use the strategy pattern: the store holds the strategy and passes it to the writer.
import org.geotools.data.store.ContentEntry; import org.geotools.data.store.ContentFeatureStore; import org.geotools.data.store.ContentState; import org.geotools.geometry.jts.ReferencedEnvelope; import org.geotools.tutorial.csv3.parse.CSVStrategy; /** * Read-write access to CSV File. * * @author Jody Garnett (Boundless) * @author Ian Turton (Envitia) */ public class CSVFeatureStore extends ContentFeatureStore { private CSVStrategy csvStrategy; private CSVFileState csvFileState; public CSVFeatureStore(CSVStrategy csvStrategy, CSVFileState csvFileState, ContentEntry entry, Query query) { super(entry, query); this.csvStrategy = csvStrategy; this.csvFileState = csvFileState; } // header end // getWriter start // // CSVFeatureStore implementations // @Override protected FeatureWriter<SimpleFeatureType, SimpleFeature> getWriterInternal(Query query, int flags) throws IOException { return new CSVFeatureWriter(this.csvFileState, this.csvStrategy, query); } // getWriter end // transaction start /** * Delegate used for FeatureSource methods (We do this because Java cannot inherit from both ContentFeatureStore and * CSVFeatureSource at the same time */ CSVFeatureSource delegate = new CSVFeatureSource(entry, query) { @Override public void setTransaction(Transaction transaction) { super.setTransaction(transaction); CSVFeatureStore.this.setTransaction(transaction); // Keep these two implementations on the same transaction } }; @Override public void setTransaction(Transaction transaction) { super.setTransaction(transaction); if (delegate.getTransaction() != transaction) { delegate.setTransaction(transaction); } } // transaction end // internal start // // Internal Delegate Methods // Implement FeatureSource methods using CSVFeatureSource implementation // @Override protected SimpleFeatureType buildFeatureType() throws IOException { return delegate.buildFeatureType(); } @Override protected ReferencedEnvelope getBoundsInternal(Query query) throws IOException { return 
delegate.getBoundsInternal(query); } @Override protected int getCountInternal(Query query) throws IOException { return delegate.getCountInternal(query); } @Override protected FeatureReader<SimpleFeatureType, SimpleFeature> getReaderInternal(Query query) throws IOException { return delegate.getReaderInternal(query); } // internal end // public start // // Public Delegate Methods // Implement FeatureSource methods using CSVFeatureSource implementation // @Override public CSVDataStore getDataStore() { return delegate.getDataStore(); } @Override public ContentEntry getEntry() { return delegate.getEntry(); } public Transaction getTransaction() { return delegate.getTransaction(); } public ContentState getState() { return delegate.getState(); } public ResourceInfo getInfo() { return delegate.getInfo(); } public Name getName() { return delegate.getName(); } public QueryCapabilities getQueryCapabilities() { return delegate.getQueryCapabilities(); } // public start }
CSVFeatureWriter
The CSVFeatureWriter handles the writing functionality for our CSVFeatureStore. With the new architecture, a new class called CSVIterator is used as our delegate (private CSVIterator iterator;) rather than the CSVFeatureReader.
import org.geotools.data.DataUtilities; import org.geotools.feature.simple.SimpleFeatureBuilder; import org.geotools.tutorial.csv3.parse.CSVIterator; import org.geotools.tutorial.csv3.parse.CSVStrategy; /** * Iterator supporting writing of feature content. * * @author Jody Garnett (Boundless) * @author Lee Breisacher */ public class CSVFeatureWriter implements FeatureWriter<SimpleFeatureType, SimpleFeature> { private SimpleFeatureType featureType; private CSVStrategy csvStrategy; private CSVFileState csvFileState; /** Temporary file used to stage output */ private File temp; /** iterator handling reading of original file */ private CSVIterator iterator; /** CsvWriter used for temp file output */ private CSVWriter csvWriter; /** Flag indicating we have reached the end of the file */ private boolean appending = false; /** Row count used to generate FeatureId when appending */ int nextRow = 0; /** Current feature available for modification. May be null if feature removed */ private SimpleFeature currentFeature; // docs start CSVFeatureWriter
The feature type we grab for writing is dependent on our strategy; therefore, we must feed CSVFeatureWriter our CSVStrategy and grab the feature type from it. We'll also get our iterator, which reads the file, from our CSVStrategy. Finally, we'll set up a CSVWriter to write to a new file, temp, with the same headers as our current file.
public CSVFeatureWriter(CSVFileState csvFileState, CSVStrategy csvStrategy) throws IOException { this(csvFileState, csvStrategy, Query.ALL); } public CSVFeatureWriter(CSVFileState csvFileState, CSVStrategy csvStrategy, Query query) throws IOException { this.csvFileState = csvFileState; File file = csvFileState.getFile(); File directory = file.getParentFile(); String typeName = query.getTypeName(); this.temp = File.createTempFile(typeName + System.currentTimeMillis(), ".csv", directory); this.temp.deleteOnExit(); this.featureType = csvStrategy.getFeatureType(); this.iterator = csvStrategy.iterator(); this.csvStrategy = csvStrategy; this.csvWriter = new CSVWriter(new FileWriter(this.temp)); this.csvWriter.writeNext(this.csvFileState.getCSVHeaders(), false); }
The hasNext() method first checks whether the writer has been closed or whether we're appending content, in which case we are done reading and there is nothing next. Otherwise, it hands off to the CSVIterator's implementation.
@Override public SimpleFeatureType getFeatureType() { return this.featureType; } // featureType end // hasNext start @Override public boolean hasNext() throws IOException { if (csvWriter == null) { return false; } if (this.appending) { return false; // reader has no more contents } return iterator.hasNext(); }
The next() method also checks whether we are appending. If we're not done reading, we grab the next feature from our iterator; otherwise we are done reading and switch to appending content. In that case, it builds a new feature with default values for the caller to fill in. remove() just sets the current feature to null, preventing it from being written.
@Override public SimpleFeature next() throws IOException, IllegalArgumentException, NoSuchElementException { if (csvWriter == null) { throw new IOException("Writer has been closed"); } if (this.currentFeature != null) { this.write(); // the previous one was not written, so do it now. } try { if (!appending) { if (iterator.hasNext()) { this.currentFeature = iterator.next(); return this.currentFeature; } else { this.appending = true; } } String fid = featureType.getTypeName() + "-fid" + nextRow; Object[] values = DataUtilities.defaultValues(featureType); this.currentFeature = SimpleFeatureBuilder.build(featureType, values, fid); return this.currentFeature; } catch (IllegalArgumentException invalid) { throw new IOException("Unable to create feature:" + invalid.getMessage(), invalid); } } // next end // remove start /** Mark our {@link #currentFeature} feature as null, it will be skipped when written effectively removing it. */ public void remove() throws IOException { this.currentFeature = null; // just mark it done which means it will not get written out. }
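The read-then-append behaviour can be easier to follow in a stripped-down form. The sketch below is a hypothetical in-memory analogue (RowWriter, working on strings instead of features and a list instead of a file); it illustrates the state handling only and is not the tutorial's class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical in-memory writer: edits existing rows first, then appends new ones. */
final class RowWriter {
    private final Iterator<String> existing;
    private final List<String> output = new ArrayList<>();
    private boolean appending = false; // true once the existing rows are exhausted
    private String current;            // row available for modification; null once written or removed

    RowWriter(List<String> rows) {
        this.existing = rows.iterator();
    }

    /** Existing content only: once we are appending there is nothing "next" to read. */
    boolean hasNext() {
        return !appending && existing.hasNext();
    }

    /** Returns the next existing row, or a blank row to fill in once the existing rows run out. */
    String next() {
        write(); // the previous row was not written explicitly, so do it now
        if (!appending && existing.hasNext()) {
            current = existing.next();
        } else {
            appending = true;
            current = "";
        }
        return current;
    }

    /** Replaces the content of the current row. */
    void set(String value) {
        current = value;
    }

    /** Marks the current row as removed: it will be skipped when written. */
    void remove() {
        current = null;
    }

    /** Writes the current row, if there is one, and clears it. */
    void write() {
        if (current != null) {
            output.add(current);
            current = null;
        }
    }

    /** Flushes any pending row and returns everything written so far. */
    List<String> result() {
        write();
        return output;
    }
}
```

Given the rows a, b and c, calling next(), write(), next(), remove(), next(), next(), set("d"), write() leaves a, c and d in the output: b is dropped, c is flushed automatically by the following next(), and d is appended.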
Finally, the write() method takes our current feature and uses the strategy to encode it. The encoding gives us back the feature as a CSV record (a String[]), which our writer then writes out to the temporary file. When the writer is closed, close() writes out any remaining content and then copies the temporary file over the file our store holds in CSVFileState.
public void write() throws IOException { if (this.currentFeature == null) { return; // current feature has been deleted } this.csvWriter.writeNext(this.csvStrategy.encode(this.currentFeature), false); nextRow++; this.currentFeature = null; // indicate that it has been written } // write end // close start @Override public void close() throws IOException { if (csvWriter == null) { throw new IOException("Writer already closed"); } if (this.currentFeature != null) { this.write(); // the previous one was not written, so do it now. } // Step 1: Write out remaining contents (if applicable) while (hasNext()) { next(); write(); } csvWriter.close(); csvWriter = null; if (this.iterator != null) { this.iterator.close(); this.iterator = null; } // Step 2: Replace file contents File file = this.csvFileState.getFile(); Files.copy(temp.toPath(), file.toPath(), StandardCopyOption.REPLACE_EXISTING); temp.delete(); } }
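The staging approach used by the writer (write everything to a temporary file next to the target, then replace the target in one step) can be sketched with the JDK alone. StagedFileWriter and its rewrite() method are hypothetical names for this illustration, and the sketch writes plain lines rather than CSV records.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Hypothetical helper: stage new content in a temp file, then replace the target. */
final class StagedFileWriter {

    private StagedFileWriter() {}

    static void rewrite(Path target, List<String> lines) throws IOException {
        // Create the temp file in the same directory as the target, as the writer above does.
        Path directory = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(directory, "staged", ".csv");
        try {
            // Step 1: write the complete new content to the temp file.
            try (BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
                for (String line : lines) {
                    writer.write(line);
                    writer.newLine();
                }
            }
            // Step 2: replace the original file's contents with the staged copy.
            Files.copy(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // clean up whether or not the copy succeeded
        }
    }
}
```

Staging matters here because the original file is still being read while the new content is produced; writing directly to it would destroy rows that have not been read yet.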
CSVIterator
The CSVIterator is a helper class primarily for CSVFeatureReader. Much of the old reader code now lives here, with the added benefit that an iterator can be instantiated for use elsewhere (the CSVFeatureWriter does so), making the code more general than before. With the addition of the CSVFileState, the class now reads from it instead of from the CSVDataStore.
import org.geotools.tutorial.csv3.CSVFileState; public class CSVIterator implements Iterator<SimpleFeature> { private int idx; private SimpleFeature next; private final CsvReader csvReader; private final CSVStrategy csvStrategy; public CSVIterator(CSVFileState csvFileState, CSVStrategy csvStrategy) throws IOException { this.csvStrategy = csvStrategy; csvReader = csvFileState.openCSVReader(); idx = 1; next = null; }
Because we're now using strategy objects to implement functionality, the readFeature() method no longer makes any assumptions about the nature of the data. That decision is delegated to the strategy. The resulting method is shorter: it simply passes each record it reads to buildFeature(), which builds the feature according to the strategy.
private SimpleFeature readFeature() throws IOException {
    if (csvReader.readRecord()) {
        String[] csvRecord = csvReader.getValues();
        return buildFeature(csvRecord);
    }
    return null;
}
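The excerpt does not show how hasNext() and next() use the buffered next field, so the following is only a sketch of the usual look-ahead pattern such a field suggests. LineIterator is a hypothetical class that iterates over lines of text rather than features, and it assumes that a null read means end of input, as readFeature() does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical look-ahead iterator over the lines of a reader. */
final class LineIterator implements Iterator<String> {
    private final BufferedReader reader;
    private String next; // buffered record, or null if nothing has been read ahead

    LineIterator(BufferedReader reader) {
        this.reader = reader;
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            next = readLine(); // read ahead once; stays null at end of input
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = next;
        next = null; // hand the buffered record over and clear the buffer
        return result;
    }

    private String readLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Buffering one record lets hasNext() be called any number of times without skipping data, which a plain "read on every call" implementation would not guarantee.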
CSVFileState
The CSVFileState is a new class to assist with file manipulation in our CSVDataStore. It holds some information about our .csv file and allows it to be opened for reading.
/** Details from comma separated value file. */ public class CSVFileState { private static CoordinateReferenceSystem DEFAULT_CRS() throws FactoryException { return CRS.decode("EPSG:4326"); } ; private final File file; private final String typeName; private final CoordinateReferenceSystem crs; private final URI namespace; private final String dataInput; private volatile String[] headers = null; public CSVFileState(File file) { this(file, null, null, null); } public CSVFileState(File file, URI namespace) { this(file, namespace, null, null); } public CSVFileState(File file, URI namespace, String typeName, CoordinateReferenceSystem crs) { this.file = file; this.typeName = typeName; this.crs = crs; this.namespace = namespace; this.dataInput = null; } // used by unit tests public CSVFileState(String dataInput) { this(dataInput, null); } public CSVFileState(String dataInput, String typeName) { this.dataInput = dataInput; this.typeName = typeName; this.crs = null; this.namespace = null; this.file = null; } public URI getNamespace() { return namespace; } public File getFile() { return file; } public String getTypeName() { return typeName != null ? typeName : FilenameUtils.getBaseName(file.getPath()); } public CoordinateReferenceSystem getCrs() { if (crs != null) { return crs; } try { return CSVFileState.DEFAULT_CRS(); } catch (FactoryException e) { return null; } }
The class opens the file for reading, checks that the CSV headers can be read, and gives back a CsvReader to read the file through a stream.
@SuppressWarnings("PMD.CloseResource") // reader is wrapped and returned public CsvReader openCSVReader() throws IOException { Reader reader; if (file != null) { reader = new BufferedReader(new FileReader(file)); } else { reader = new StringReader(dataInput); } CsvReader csvReader = new CsvReader(reader); if (!csvReader.readHeaders()) { reader.close(); throw new IOException("Error reading csv headers"); } return csvReader; }
The readCSVHeaders() and getCSVHeaders() methods grab the headers from the file. The headers are read once, on first use, and cached in the volatile headers field.
public String[] getCSVHeaders() { if (headers == null) { synchronized (this) { if (headers == null) { headers = readCSVHeaders(); } } } return headers; } private String[] readCSVHeaders() { CsvReader csvReader = null; try { csvReader = openCSVReader(); return csvReader.getHeaders(); } catch (IOException e) { throw new RuntimeException("Failure reading csv headers", e); } finally { if (csvReader != null) { csvReader.close(); } } }
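getCSVHeaders() follows the double-checked locking idiom: an unsynchronized check, then a second check inside a synchronized block, with the cached field declared volatile. A generic, self-contained sketch of the same idiom is shown below; LazyValue is a hypothetical class, and it assumes the loader never returns null.

```java
import java.util.function.Supplier;

/** Hypothetical lazily initialised value using double-checked locking. */
final class LazyValue<T> {
    private final Supplier<T> loader;
    private volatile T value; // volatile so a fully built value is visible to all threads

    LazyValue(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T result = value; // first check, without locking
        if (result == null) {
            synchronized (this) {
                result = value; // second check, while holding the lock
                if (result == null) {
                    result = loader.get(); // runs at most once
                    value = result;
                }
            }
        }
        return result;
    }
}
```

For example, new LazyValue<>(() -> expensiveRead()) defers the read until the first get() and reuses the result afterwards; expensiveRead() stands for whatever costly operation is being cached.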
CSVStrategy
CSVStrategy defines the API used internally by CSVDataStore when converting from CSV Records to Features (and vice versa).
public abstract class CSVStrategy { protected final CSVFileState csvFileState; public CSVStrategy(CSVFileState csvFileState) { this.csvFileState = csvFileState; } public CSVIterator iterator() throws IOException { return new CSVIterator(csvFileState, this); }
The name “strategy” comes from the strategy pattern, where an object (the strategy) is injected into our CSVDataStore to configure it for use. CSVDataStore will call the strategy object as needed (rather than have a bunch of switch/case statements inside each method).
Sub-classes of CSVStrategy will need to implement:
- buildFeatureType() - generate a FeatureType (from the CSV file headers, and possibly a scan of the data)
- createSchema(SimpleFeatureType) - create a new file using the provided feature type
- decode(String, String[]) - decode a record from the CSV file
- encode(SimpleFeature) - encode a feature as a record (to be written to the CSV file)
This API is captured as an abstract class which can be sub-classed for specific strategies. The strategy object passed into the CSVDataStore determines how certain methods operate: the store calls the strategy's implementation instead of its own. Through this design, we can extend the abilities of the CSVDataStore in the future much more easily, by adding a new strategy rather than modifying the store.
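As a minimal, self-contained illustration of the pattern (not the tutorial's classes), the sketch below injects one of two hypothetical strategies into a store. RowStrategy, PlainRowStrategy, PointRowStrategy and RowStore are invented names, and PointRowStrategy assumes the first two columns hold latitude and longitude.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical strategy: decides how one CSV row is turned into attribute values. */
abstract class RowStrategy {
    abstract List<String> decode(String[] row);
}

/** Keeps every column as a plain attribute. */
final class PlainRowStrategy extends RowStrategy {
    @Override
    List<String> decode(String[] row) {
        return new ArrayList<>(Arrays.asList(row));
    }
}

/** Assumes columns 0 and 1 are latitude and longitude and merges them into one WKT point. */
final class PointRowStrategy extends RowStrategy {
    @Override
    List<String> decode(String[] row) {
        List<String> values = new ArrayList<>();
        values.add("POINT (" + row[1] + " " + row[0] + ")"); // WKT order is x (lon), then y (lat)
        values.addAll(Arrays.asList(row).subList(2, row.length));
        return values;
    }
}

/** The store never inspects which strategy it was given; it just calls it. */
final class RowStore {
    private final RowStrategy strategy;

    RowStore(RowStrategy strategy) {
        this.strategy = strategy;
    }

    List<String> read(String[] row) {
        return strategy.decode(row); // no switch/case on the kind of data
    }
}
```

Supporting another layout then means writing another RowStrategy sub-class; RowStore itself does not change.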
The base class has some support methods available for use by all the strategy objects. The createBuilder() methods are helpers that set some of the common portions of the SimpleFeatureTypeBuilder utility object, such as the type name, coordinate reference system, namespace URI, and then the column headers.
The findMostSpecificTypesFromData(CsvReader csvReader, String[] headers) method attempts to find the type of the data in each column. It tries to read each value as an Integer first; if the format is incorrect it tries a Double next, and if the format is still incorrect it defaults to a String type. It scans the entire file so that a value in a later row cannot contradict the type chosen for its column.
/** * Originally in a strategy support class - giving a chance to override them to improve efficiency and utilize the * different strategies */ public static SimpleFeatureTypeBuilder createBuilder(CSVFileState csvFileState) { CsvReader csvReader = null; Map<String, Class<?>> typesFromData = null; String[] headers = null; try { csvReader = csvFileState.openCSVReader(); headers = csvReader.getHeaders(); typesFromData = findMostSpecificTypesFromData(csvReader, headers); } catch (IOException e) { throw new RuntimeException("Failure reading csv file", e); } finally { if (csvReader != null) { csvReader.close(); } } return createBuilder(csvFileState, headers, typesFromData); } public static SimpleFeatureTypeBuilder createBuilder( CSVFileState csvFileState, String[] headers, Map<String, Class<?>> typesFromData) { SimpleFeatureTypeBuilder builder = new SimpleFeatureTypeBuilder(); builder.setName(csvFileState.getTypeName()); builder.setCRS(csvFileState.getCrs()); if (csvFileState.getNamespace() != null) { builder.setNamespaceURI(csvFileState.getNamespace()); } for (String col : headers) { Class<?> type = typesFromData.get(col); builder.add(col, type); } return builder; } /** * Performs a full file scan attempting to guess the type of each column Specific strategy implementations will * expand this functionality by overriding the buildFeatureType() method. 
*/ protected static Map<String, Class<?>> findMostSpecificTypesFromData(CsvReader csvReader, String[] headers) throws IOException { Map<String, Class<?>> result = new HashMap<>(); // start off assuming Integers for everything for (String header : headers) { result.put(header, Integer.class); } // Read through the whole file in case the type changes in later rows while (csvReader.readRecord()) { List<String> values = Arrays.asList(csvReader.getValues()); if (values.size() >= headers.length) { values = values.subList(0, headers.length); } int i = 0; for (String value : values) { String header = headers[i]; Class<?> type = result.get(header); // For each value in the row, ensure we can still parse it as the // defined type for this column; if not, make it more general if (type == Integer.class) { try { Integer.parseInt(value); } catch (NumberFormatException e) { try { Double.parseDouble(value); type = Double.class; } catch (NumberFormatException ex) { type = String.class; } } } else if (type == Double.class) { try { Double.parseDouble(value); } catch (NumberFormatException e) { type = String.class; } } result.put(header, type); i++; } } return result; }
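The narrowing rule is easier to see for a single column in isolation. The sketch below applies the same Integer, then Double, then String fallback to a list of values; ColumnTypeGuesser is a hypothetical name, and the sketch assumes the values are non-null strings.

```java
import java.util.List;

/** Hypothetical single-column version of the Integer -> Double -> String fallback. */
final class ColumnTypeGuesser {

    private ColumnTypeGuesser() {}

    static Class<?> mostSpecificType(List<String> values) {
        Class<?> type = Integer.class; // start with the narrowest candidate
        for (String value : values) {
            if (type == Integer.class && !isInteger(value)) {
                type = Double.class; // widen, then re-check the same value below
            }
            if (type == Double.class && !isDouble(value)) {
                return String.class; // nothing wider to fall back to
            }
        }
        return type;
    }

    private static boolean isInteger(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    private static boolean isDouble(String value) {
        try {
            Double.parseDouble(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

A column only ever widens: once a value forces Double or String, later values cannot narrow it again, which is why the whole file has to be scanned before the type is known. As in the method above, a column with no values stays Integer.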
CSVAttributesOnlyStrategy
The CSVAttributesOnlyStrategy is the simplest implementation. It directly reads the file and obtains the values as attributes for the feature. The feature type is built using the createBuilder() helper inherited from CSVStrategy. The headers from the .csv file are read in as the attributes of the feature type: each header names the attribute stored in that column, and each row provides the values for all the attributes of one feature. The csvRecord parameter contains one line of data read from the file, and each String is mapped to its attribute. The SimpleFeatureBuilder utility class uses all the data to build the feature.
import org.geotools.feature.simple.SimpleFeatureBuilder;
import org.geotools.feature.simple.SimpleFeatureTypeBuilder;
import org.geotools.tutorial.csv3.CSVFileState;
import org.locationtech.jts.geom.Geometry;

public class CSVAttributesOnlyStrategy extends CSVStrategy {

    public CSVAttributesOnlyStrategy(CSVFileState csvFileState) {
        super(csvFileState);
    }

    @Override
    protected SimpleFeatureType buildFeatureType() {
        SimpleFeatureTypeBuilder builder = createBuilder(csvFileState);
        return builder.buildFeatureType();
    }

    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        List<String> header = new ArrayList<>();
        this.featureType = featureType;
        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            // geometry columns are not written by this strategy
            if (descriptor instanceof GeometryDescriptor) continue;
            header.add(descriptor.getLocalName());
        }
        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }

    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<>();
        for (Property property : feature.getProperties()) {
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (!Geometry.class.isAssignableFrom(value.getClass())) {
                // skip geometries
                String txt = value.toString();
                csvRecord.add(txt);
            }
        }
        // a zero-length array lets toArray allocate the right size, even for an empty record
        return csvRecord.toArray(new String[0]);
    }

    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        String[] headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                String value = csvRecord[i].trim();
                builder.set(header, value);
            } else {
                // geotools converters take care of converting for us
                builder.set(header, null);
            }
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
}
CSVLatLonStrategy
¶
The CSVLatLonStrategy provides the additional component of replacing the Latitude and Longitude fields with a Point geometry. We search through the headers for a match for both Latitude and Longitude and, if both are found and the values in both columns are numeric, we remove those attributes and replace them with the Point geometry. The user can specify the strings to use to search for the latitude and longitude columns (for example, LAT and LON). Otherwise, the class will attempt to recognize a valid lat/lon spelling. The user can also choose to name the geometry column, or else it will default to “location”. Using this information, it builds the feature type.

public class CSVLatLonStrategy extends CSVStrategy {

    /** _CRS */
    public static final DefaultGeographicCRS _CRS = DefaultGeographicCRS.WGS84;

    private String latField;
    private String lngField;
    private String pointField;

    public CSVLatLonStrategy(CSVFileState csvFileState) {
        this(csvFileState, null, null);
    }

    public CSVLatLonStrategy(CSVFileState csvFileState, String latField, String lngField) {
        this(csvFileState, latField, lngField, "location");
    }

    public CSVLatLonStrategy(
            CSVFileState csvFileState, String latField, String lngField, String pointField) {
        super(csvFileState);
        this.latField = latField;
        this.lngField = lngField;
        this.pointField = pointField;
    }

    @Override
    protected SimpleFeatureType buildFeatureType() {
        String[] headers;
        Map<String, Class<?>> typesFromData;
        CsvReader csvReader = null;
        try {
            csvReader = csvFileState.openCSVReader();
            headers = csvReader.getHeaders();
            typesFromData = findMostSpecificTypesFromData(csvReader, headers);
        } catch (IOException e) {
            throw new RuntimeException(e);
        } finally {
            if (csvReader != null) {
                csvReader.close();
            }
        }
        SimpleFeatureTypeBuilder builder = createBuilder(csvFileState, headers, typesFromData);

        // If the lat/lon fields were not specified, figure out their spelling now
        if (latField == null || lngField == null) {
            for (String col : headers) {
                if (isLatitude(col)) {
                    latField = col;
                } else if (isLongitude(col)) {
                    lngField = col;
                }
            }
        }

        // For LatLon strategy, we need to change the Lat and Lon columns
        // to be recognized as a Point rather than two numbers, if the
        // values in the respective columns are all accurate (numeric)
        Class<?> latClass = typesFromData.get(latField);
        Class<?> lngClass = typesFromData.get(lngField);
        if (isNumeric(latClass) && isNumeric(lngClass)) {
            List<String> csvHeaders = Arrays.asList(headers);
            int index = csvHeaders.indexOf(latField);
            AttributeTypeBuilder builder2 = new AttributeTypeBuilder();
            builder2.setCRS(_CRS);
            builder2.binding(Point.class);
            AttributeDescriptor descriptor = builder2.buildDescriptor(pointField);
            // The Point takes the place of the latitude column
            builder.add(index, descriptor);
            builder.remove(latField);
            builder.remove(lngField);
        }
        return builder.buildFeatureType();
    }

    private boolean isLatitude(String s) {
        return "latitude".equalsIgnoreCase(s) || "lat".equalsIgnoreCase(s);
    }

    private boolean isLongitude(String s) {
        return "lon".equalsIgnoreCase(s)
                || "lng".equalsIgnoreCase(s)
                || "long".equalsIgnoreCase(s)
                || "longitude".equalsIgnoreCase(s);
    }
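To see the column replacement in isolation, here is a minimal sketch that works on plain header names instead of attribute descriptors. The class LatLonColumns and its collapse() method are hypothetical and exist only for illustration; the numeric check done by buildFeatureType() is left out, so this sketch assumes both columns hold numbers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; not part of the tutorial's CSVLatLonStrategy. */
class LatLonColumns {

    static boolean isLatitude(String s) {
        return "latitude".equalsIgnoreCase(s) || "lat".equalsIgnoreCase(s);
    }

    static boolean isLongitude(String s) {
        return "lon".equalsIgnoreCase(s)
                || "lng".equalsIgnoreCase(s)
                || "long".equalsIgnoreCase(s)
                || "longitude".equalsIgnoreCase(s);
    }

    /**
     * Returns the column names with the latitude/longitude pair collapsed into one
     * point column, placed where the latitude column used to be. If either column
     * is missing, a copy of the headers is returned unchanged.
     */
    static List<String> collapse(List<String> headers, String pointField) {
        String latField = null;
        String lngField = null;
        for (String col : headers) {
            if (isLatitude(col)) {
                latField = col;
            } else if (isLongitude(col)) {
                lngField = col;
            }
        }
        List<String> columns = new ArrayList<>(headers);
        if (latField == null || lngField == null) {
            return columns;
        }
        // Insert the point column at the latitude position, then drop the pair
        columns.add(columns.indexOf(latField), pointField);
        columns.remove(latField);
        columns.remove(lngField);
        return columns;
    }
}
```

With the headers CITY, LAT, LON, NUMBER and the default point name, the resulting columns are CITY, location, NUMBER.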
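When encoding the feature, the Point geometry is written back out as two values: its Y value (latitude) and its X value (longitude). Latitude first is the axis order that EPSG:4326 defines for WGS84; the code compares the header positions of the latitude and longitude columns to choose between writing Y, X and X, Y. Otherwise, it works the same as the attributes-only strategy.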
    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<>();
        String[] headers = csvFileState.getCSVHeaders();
        // Locate the latitude and longitude columns in the header
        int latIndex = 0;
        int lngIndex = 0;
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].equalsIgnoreCase(latField)) {
                latIndex = i;
            }
            if (headers[i].equalsIgnoreCase(lngField)) {
                lngIndex = i;
            }
        }
        for (Property property : feature.getProperties()) {
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (value instanceof Point) {
                // A Point is written out as two values
                Point point = (Point) value;
                if (lngIndex < latIndex) {
                    csvRecord.add(Double.toString(point.getY()));
                    csvRecord.add(Double.toString(point.getX()));
                } else {
                    csvRecord.add(Double.toString(point.getX()));
                    csvRecord.add(Double.toString(point.getY()));
                }
            } else {
                String txt = value.toString();
                csvRecord.add(txt);
            }
        }
        // A zero-length array lets toArray() allocate one of the right size
        return csvRecord.toArray(new String[0]);
    }
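When decoding a CsvRecord into a feature, we look for the latField and lngField columns and hold on to their values. At the end, if we have successfully parsed both a latitude and a longitude, we combine them into a Point and set it on the feature. The coordinate is built in the axis order of the geometry's coordinate reference system: (lng, lat) when the first axis points east, (lat, lng) otherwise.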
    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        GeometryDescriptor geometryDescriptor = featureType.getGeometryDescriptor();
        GeometryFactory geometryFactory = new GeometryFactory();
        Double lat = null, lng = null;
        String[] headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                String value = csvRecord[i].trim();
                if (geometryDescriptor != null && header.equals(latField)) {
                    lat = Double.valueOf(value);
                } else if (geometryDescriptor != null && header.equals(lngField)) {
                    lng = Double.valueOf(value);
                } else {
                    builder.set(header, value);
                }
            } else {
                builder.set(header, null);
            }
        }
        if (geometryDescriptor != null && lat != null && lng != null) {
            Coordinate coordinate;
            if (geometryDescriptor
                    .getCoordinateReferenceSystem()
                    .getCoordinateSystem()
                    .getAxis(0)
                    .getDirection()
                    .equals(AxisDirection.EAST)) {
                coordinate = new Coordinate(lng, lat);
            } else {
                coordinate = new Coordinate(lat, lng);
            }
            Point point = geometryFactory.createPoint(coordinate);
            builder.set(geometryDescriptor.getLocalName(), point);
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
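The axis-order decision at the end of decode() can be tried out without any GeoTools types. The sketch below is a hypothetical helper (LatLonDecoder and toOrdinates() are illustration-only names) that takes the two text values and a flag saying whether the first axis points east. Unlike decode(), which lets Double.valueOf() throw on bad input, this sketch returns null for a missing or non-numeric value; that is a different policy, chosen here only to keep the example easy to test.

```java
/** Hypothetical stand-alone sketch of the axis-order decision; no GeoTools types involved. */
class LatLonDecoder {

    /**
     * Parses the two ordinates and returns them as {x, y} in the order the target
     * coordinate system expects. Returns null if either value is missing or not numeric.
     */
    static double[] toOrdinates(String latText, String lngText, boolean firstAxisIsEast) {
        if (latText == null || lngText == null) {
            return null;
        }
        try {
            double lat = Double.parseDouble(latText.trim());
            double lng = Double.parseDouble(lngText.trim());
            // East-first systems take (lng, lat); north-first systems take (lat, lng)
            return firstAxisIsEast ? new double[] {lng, lat} : new double[] {lat, lng};
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

For an east-first coordinate system, toOrdinates("48.4284", "-123.3656", true) puts the longitude in the first slot and the latitude in the second.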
For our createSchema() method, we look for the geometry column that we should have created: a Point with WGS84 as its CRS. If we find it, we add our specified latField and lngField to the header, in the order given by the first axis of the CRS. If we do not, we throw an IOException. The rest of the columns just use the names they were given. Any GeometryDescriptor is skipped, because that was our lat/lon column; everything else in this strategy is stored as a plain attribute. Finally, the header is written using the CsvWriter.
    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        this.featureType = featureType;
        List<String> header = new ArrayList<>();
        GeometryDescriptor gd = featureType.getGeometryDescriptor();
        CoordinateReferenceSystem crs = gd != null ? gd.getCoordinateReferenceSystem() : null;
        if (gd != null
                && CRS.equalsIgnoreMetadata(_CRS, crs)
                && gd.getType().getBinding().isAssignableFrom(Point.class)) {
            // Header order follows the first axis of the CRS
            if (crs.getCoordinateSystem().getAxis(0).getDirection().equals(AxisDirection.NORTH)) {
                header.add(this.latField);
                header.add(this.lngField);
            } else {
                header.add(this.lngField);
                header.add(this.latField);
            }
        } else {
            throw new IOException(
                    "Unable to use " + this.latField + "/" + this.lngField + " to represent " + gd);
        }
        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            if (descriptor instanceof GeometryDescriptor) continue;
            header.add(descriptor.getLocalName());
        }
        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }
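The header layout produced by createSchema() is easy to check on its own. The following is a rough sketch with plain strings standing in for attribute descriptors; LatLonHeader, build() and toCsvLine() are hypothetical names, and the naive comma join assumes that no column name needs quoting (the real method leaves that to CsvWriter).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-alone sketch; plain strings stand in for attribute descriptors. */
class LatLonHeader {

    /** Puts the lat/lng pair first, in axis order, followed by the non-geometry columns. */
    static List<String> build(
            boolean firstAxisIsNorth, String latField, String lngField, List<String> others) {
        List<String> header = new ArrayList<>();
        if (firstAxisIsNorth) {
            header.add(latField);
            header.add(lngField);
        } else {
            header.add(lngField);
            header.add(latField);
        }
        header.addAll(others);
        return header;
    }

    /** Joins the header into one CSV line; assumes no name contains a comma or quote. */
    static String toCsvLine(List<String> header) {
        return String.join(",", header);
    }
}
```

For an east-first CRS with fields LAT and LON and the remaining columns CITY and NUMBER, the header line comes out as LON,LAT,CITY,NUMBER.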
CSVSpecifiedWKTStrategy
¶
CSVSpecifiedWKTStrategy is the strategy used for geometry stored in Well-Known Text (WKT) format. Similar to the CSVLatLonStrategy, which can be told which columns hold the latitude and longitude, this strategy must be given the name of the column that holds the WKT. If that column is found in the header, it is bound to the Geometry class.
public class CSVSpecifiedWKTStrategy extends CSVStrategy {

    private final String wktField;

    public CSVSpecifiedWKTStrategy(CSVFileState csvFileState, String wktField) {
        super(csvFileState);
        this.wktField = wktField;
    }
To build the feature type with this strategy, the only thing that needs to change is the attribute for the specified WKT field. Instead of reading this data as an Integer, Double or String (as in the base CSVStrategy class’s createBuilder() method), we want to use the Geometry class to store the information in the WKT field’s column. To do this, we create an AttributeTypeBuilder, set its CRS to WGS84 and its binding to Geometry.class. We then build an AttributeDescriptor from this builder, supplying the wktField as its name. Finally, we call set() on the featureBuilder with this AttributeDescriptor, which overwrites the existing attribute of that name with the new information.
    @Override
    protected SimpleFeatureType buildFeatureType() {
        SimpleFeatureTypeBuilder featureBuilder = createBuilder(csvFileState);
        // For WKT strategy, we need to make sure the wktField is recognized as a Geometry
        AttributeDescriptor descriptor = featureBuilder.get(wktField);
        if (descriptor != null) {
            AttributeTypeBuilder attributeBuilder = new AttributeTypeBuilder();
            attributeBuilder.init(descriptor);
            attributeBuilder.setCRS(DefaultGeographicCRS.WGS84);
            attributeBuilder.binding(Geometry.class);
            AttributeDescriptor modified = attributeBuilder.buildDescriptor(wktField);
            featureBuilder.set(modified);
        }
        return featureBuilder.buildFeatureType();
    }
For creating the schema, the only thing we search for is a GeometryDescriptor, which we know is our wktField and which is written to the header under that name. All other columns just use the names they were given.
    @Override
    public void createSchema(SimpleFeatureType featureType) throws IOException {
        this.featureType = featureType;
        List<String> header = new ArrayList<>();
        for (AttributeDescriptor descriptor : featureType.getAttributeDescriptors()) {
            if (descriptor instanceof GeometryDescriptor) {
                header.add(wktField);
            } else {
                header.add(descriptor.getLocalName());
            }
        }
        // Write out header, producing an empty file of the correct type
        CsvWriter writer = new CsvWriter(new FileWriter(this.csvFileState.getFile()), ',');
        try {
            writer.writeRecord(header.toArray(new String[header.size()]));
        } finally {
            writer.close();
        }
    }
When encoding a feature, we look for the property named by the wktField described by the strategy. If found, we use a WKTWriter to write the Geometry out as WKT text, which is then added to our CsvRecord. Otherwise, the value is passed to the utility method Converters.convert(), which writes the value out as a String. A null value is written as an empty string.
    @Override
    public String[] encode(SimpleFeature feature) {
        List<String> csvRecord = new ArrayList<>();
        for (Property property : feature.getProperties()) {
            String name = property.getName().getLocalPart();
            Object value = property.getValue();
            if (value == null) {
                csvRecord.add("");
            } else if (name.equals(wktField)) {
                // The geometry column is written out as WKT text
                WKTWriter wkt = new WKTWriter();
                String txt = wkt.write((Geometry) value);
                csvRecord.add(txt);
            } else {
                String txt = Converters.convert(value, String.class);
                csvRecord.add(txt);
            }
        }
        // A zero-length array lets toArray() allocate one of the right size
        return csvRecord.toArray(new String[0]);
    }
When decoding a CsvRecord, we check whether we are in the WKT column (the current header value is the wktField specified) and whether our featureType has a GeometryDescriptor. If both are true, we create a WKTReader to read the value as a Geometry so that we can build our feature with it. If parsing fails, the ParseException is caught and the attribute is set to null.
    @Override
    public SimpleFeature decode(String recordId, String[] csvRecord) {
        SimpleFeatureType featureType = getFeatureType();
        SimpleFeatureBuilder builder = new SimpleFeatureBuilder(featureType);
        GeometryDescriptor geometryDescriptor = featureType.getGeometryDescriptor();
        String[] headers = csvFileState.getCSVHeaders();
        for (int i = 0; i < headers.length; i++) {
            String header = headers[i];
            if (i < csvRecord.length) {
                String value = csvRecord[i].trim();
                if (geometryDescriptor != null && header.equals(wktField)) {
                    WKTReader wktReader = new WKTReader();
                    Geometry geometry;
                    try {
                        geometry = wktReader.read(value);
                    } catch (ParseException e) {
                        // policy decision here that just nulls out unparseable geometry
                        geometry = null;
                    }
                    builder.set(wktField, geometry);
                } else {
                    builder.set(header, value);
                }
            } else {
                builder.set(header, null);
            }
        }
        return builder.buildFeature(csvFileState.getTypeName() + "-" + recordId);
    }
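The "null out unparseable geometry" policy can be illustrated without JTS. The sketch below uses a hypothetical PointWkt class as a very small stand-in for WKTReader: it understands only text of the form POINT (x y) and returns null for anything else, mirroring how decode() swallows the ParseException. It is not a WKT parser and is meant only to show the policy.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for WKTReader that understands only "POINT (x y)". */
class PointWkt {

    private static final Pattern POINT =
            Pattern.compile(
                    "\\s*POINT\\s*\\(\\s*(\\S+)\\s+(\\S+)\\s*\\)\\s*", Pattern.CASE_INSENSITIVE);

    /** Returns {x, y}, or null when the text is not a parseable point. */
    static double[] parseOrNull(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = POINT.matcher(text);
        if (!m.matches()) {
            return null;
        }
        try {
            return new double[] {
                Double.parseDouble(m.group(1)), Double.parseDouble(m.group(2))
            };
        } catch (NumberFormatException e) {
            // Same policy as decode(): a bad value becomes null rather than an error
            return null;
        }
    }
}
```

A caller then treats null as "no geometry for this record" instead of failing the whole read, which is the trade-off the policy comment in decode() refers to.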