Shapefile Plugin¶
Allows the GeoTools library to work with ESRI shapefiles.
References
ShapefileDataStoreFactory (javadocs)
Maven:
<dependency>
  <groupId>org.geotools</groupId>
  <artifactId>gt-shapefile</artifactId>
  <version>${geotools.version}</version>
</dependency>
Connection Parameters¶
The following connection parameters are available:
Parameter | Description
---|---
url | A URL of the file ending in shp
namespace | Optional: URI to use for the FeatureType
fstype | Optional: Use shape or index to select the store implementation (advanced)
charset | Optional: character set used to decode strings from the DBF file
timezone | Optional: Timezone used to parse dates in the DBF file
memory mapped buffer | Optional: memory map the files (not recommended for large files under Windows, defaults to false)
cache and reuse memory maps | Optional: when memory mapping, cache and reuse memory maps (defaults to true)
create spatial index | Optional: if false, won’t try to create a spatial index if missing (defaults to true)
enable spatial index | Optional: if false, the spatial index won’t be used even if available (and won’t be created if missing)

This information is also available in the javadocs.
Internally gt-shapefile provides two implementations at this time: one for simple access and another that supports the use of an index. The factory will be able to sort out which one is appropriate when using DataStoreFinder or FileDataStoreFinder.
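For example, the optional parameters are supplied alongside url in the connection map passed to DataStoreFinder. This is a minimal sketch rather than an example from the guide; the file name and the values are only illustrative:
File file = new File("example.shp");
Map<String, Object> params = new HashMap<>();
params.put("url", file.toURI().toURL());
params.put("create spatial index", Boolean.TRUE);  // create the .qix index if it is missing
params.put("memory mapped buffer", Boolean.FALSE); // skip memory mapping (safer for large files on Windows)
params.put("charset", "ISO-8859-1");               // encoding of strings in the DBF file
DataStore dataStore = DataStoreFinder.getDataStore(params);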
Shapefile¶
A Shapefile is a common file format which contains numerous features of the same type. Each shapefile has a single feature type.
The classic three files:

* filename.shp: shapes
* filename.shx: shapes to attributes index
* filename.dbf: attributes

Basic metadata:

* filename.prj: projection

Open source extensions:

* filename.qix: quadtree spatial index
* filename.fix: feature id index
* filename.sld: Styled Layer Descriptor style XML object

ESRI proprietary extensions (ignored by GeoTools):

* filename.sbn: attribute index
* filename.sbx: spatial index
* filename.lyr: ArcMap-only style object
* filename.avl: ArcView style object
* filename.shp.xml: FGDC metadata
This style of file format (from the dawn of time) is referred to as “sidecar” files; at a minimum the file filename.shp and its sidecar file filename.dbf are needed.
If the DataStore is used for reading only, the files may be gzip-ped and marked by the additional filename extension .gz. If the shp or shp.gz file is missing, features are furnished without geometries, so only a dbf or a dbf.gz file needs to be present. The given URL may end in shp, shp.gz, dbf or dbf.gz.
Access¶
Working with an Existing Shapefile:
File file = new File("example.shp");
Map<String, Object> map = new HashMap<>();
map.put("url", file.toURI().toURL());
DataStore dataStore = DataStoreFinder.getDataStore(map);
String typeName = dataStore.getTypeNames()[0];
FeatureSource<SimpleFeatureType, SimpleFeature> source =
        dataStore.getFeatureSource(typeName);
Filter filter = Filter.INCLUDE; // ECQL.toFilter("BBOX(THE_GEOM, 10,20,30,40)")
FeatureCollection<SimpleFeatureType, SimpleFeature> collection = source.getFeatures(filter);
try (FeatureIterator<SimpleFeature> features = collection.features()) {
    while (features.hasNext()) {
        SimpleFeature feature = features.next();
        System.out.print(feature.getID());
        System.out.print(": ");
        System.out.println(feature.getDefaultGeometryProperty().getValue());
    }
}
Creating¶
Here is a quick example:
FileDataStoreFactorySpi factory = new ShapefileDataStoreFactory();
File file = new File("my.shp");
Map<String, ?> map = Collections.singletonMap("url", file.toURI().toURL());
DataStore myData = factory.createNewDataStore(map);
SimpleFeatureType featureType =
        DataUtilities.createType(
                "my", "geom:Point,name:String,age:Integer,description:String");
myData.createSchema(featureType);
The featureType above was created quickly using DataUtilities; in your own application you may wish to use a SimpleFeatureTypeBuilder.
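As a rough sketch, the same schema could be assembled with SimpleFeatureTypeBuilder; the string length restriction and the WGS84 CRS shown here are illustrative choices, not requirements:
SimpleFeatureTypeBuilder builder = new SimpleFeatureTypeBuilder();
builder.setName("my");
builder.setCRS(DefaultGeographicCRS.WGS84); // supplies the contents of the prj sidecar file
builder.add("geom", Point.class);
builder.length(15).add("name", String.class); // strings are stored as fixed length fields in the DBF
builder.add("age", Integer.class);
builder.add("description", String.class);
SimpleFeatureType featureType = builder.buildFeatureType();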
Supports:

* attribute names must be 15 characters or less (you will get a warning otherwise)
* a single geometry column named the_geom (stored in the SHP file):
  * LineString, MultiLineString
  * Polygon, MultiPolygon
  * Point, MultiPoint
  Geometries can also contain a measure (M) value or Z & M values.
* “simple” attributes (stored in the DBF file):
  * String, max length of 255
  * Integer
  * Double
  * Boolean
  * Date - a Timestamp interpretation that is just the date
Limitations:

* Only works with MultiLineString, MultiPolygon or MultiPoint. GIS data often travels in herds, so being restricted to the plural form is not a great limitation.
* Only works with fixed length strings (you will find the FeatureType has a restriction to help you check this, and warnings will be produced if your content ends up trimmed).
* Only supports a single GeometryAttribute; Shapefile does not support plain Geometry (i.e. mixed LineString, Point and Polygon all in the same file).
* The maximum shapefile size is 2GB (its sidecar DBF file is often limited to 2GB as well, with some systems being able to read 4GB or more).
* Dates do not support the storage of time by default. If you must store time stamps and do not need interoperability you can enable the storage of time in date columns by setting the system property org.geotools.shapefile.datetime to “true”, as shown below. Almost no other program will be able to read these files.
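For example, the property can be set before the shapefile is opened (a sketch; it can also be passed on the command line as -Dorg.geotools.shapefile.datetime=true):
// enable storing time in DBF date columns; almost no other program will read this
System.setProperty("org.geotools.shapefile.datetime", "true");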
Dumping almost anything into a shapefile¶
In case the feature collection to be turned into a shapefile does not fit the shapefile format limitations, it is still possible to create shapefiles out of it, leaving all the structural bridging work to the ShapefileDumper class.
In particular, given one or more feature collections, the dumper will:

* Reduce attribute names to the DBF accepted length, making sure there are no conflicts (counters are added at the end of the attribute name to handle this).
* Fan out multiple geometry types into parallel shapefiles, named after the original feature type, plus the geometry type as a suffix.
* Fan out into multiple shapefiles in case the maximum size is reached.
Example usage:
ShapefileDumper dumper = new ShapefileDumper(new File("./target/demo"));
// optional, set a target charset (ISO-8859-1 is the default)
dumper.setCharset(Charset.forName("ISO-8859-15"));
// split when shp or dbf reaches 100MB
int maxSize = 100 * 1024 * 1024;
dumper.setMaxShpSize(maxSize);
dumper.setMaxDbfSize(maxSize);
// actually dump data
SimpleFeatureCollection fc = getFeatureCollection();
dumper.dump(fc);
Force Projection¶
If you run the above code and then load the result in a GIS application like ArcMap, it will complain that the projection is unknown.
You can “force” the projection using the following code:
CoordinateReferenceSystem crs = CRS.decode("EPSG:4326");
shape.forceSchemaCRS(crs);
This is only a problem if you did not specify the CoordinateReferenceSystem as part of your FeatureType’s GeometryAttribute, or if a prj file has not been provided.
Character Sets¶
If you are working with Arabic, Chinese or Korean character sets you will need to make use of the charset connection parameter when setting up your shapefile. The codes used here are the same as those documented for the Java Charset class. You can provide a Charset directly, or if you provide a String it will be converted to a Charset.
Thanks to the University of Seoul for providing and testing this functionality.
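For example, a sketch reusing the connection map approach shown earlier; the file name and encoding are only illustrative:
Map<String, Object> params = new HashMap<>();
params.put("url", new File("cities.shp").toURI().toURL());
params.put("charset", "EUC-KR"); // a Charset object is accepted here as well
DataStore dataStore = DataStoreFinder.getDataStore(params);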
Timezone¶
The store will build dates using the default timezone. If you need to work against meteorological data, the timezone normally has to be forced to “UTC” instead.
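A minimal sketch of forcing UTC through the timezone connection parameter (passing a java.util.TimeZone here is an assumption based on the parameter type in the factory javadocs):
Map<String, Object> params = new HashMap<>();
params.put("url", new File("observations.shp").toURI().toURL());
params.put("timezone", TimeZone.getTimeZone("UTC")); // parse DBF dates as UTC
DataStore dataStore = DataStoreFinder.getDataStore(params);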
Reading PRJ¶
You can use the CRS utility class to read the PRJ
file if required. The contents of the file are in “well known text”:
CoordinateReferenceSystem crs = CRS.parseWKT(wkt);
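A sketch of reading the sidecar file yourself (assumes Java 11+ for Files.readString; the file name is illustrative):
String wkt = Files.readString(Paths.get("example.prj"));
CoordinateReferenceSystem crs = CRS.parseWKT(wkt);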
Reading DBF¶
A shapefile is actually made up of a core shp file and a number of “sidecar” files. One of the sidecar files is a dbf file used to record attributes. This is the original DBF file format provided by one of the original grandfather databases, dBase III.
File file = new File("my.shp");
FileDataStore myData = FileDataStoreFinder.getDataStore(file);
SimpleFeatureSource source = myData.getFeatureSource();
SimpleFeatureType schema = source.getSchema();
Query query = new Query(schema.getTypeName());
query.setMaxFeatures(1);
FeatureCollection<SimpleFeatureType, SimpleFeature> collection = source.getFeatures(query);
try (FeatureIterator<SimpleFeature> features = collection.features()) {
    while (features.hasNext()) {
        SimpleFeature feature = features.next();
        System.out.println(feature.getID() + ": ");
        for (Property attribute : feature.getProperties()) {
            System.out.println("\t" + attribute.getName() + ":" + attribute.getValue());
        }
    }
}
The GeoTools library includes just enough DBF file format support to get out of bed in the morning; indeed you should consider these facilities an internal detail of our shapefile reading code.
Thanks to Larry Reeder from the user list for supplying the following code example:
// The example assumes the first field has a character data type
// and the second has a numeric data type:
FileInputStream fis = new FileInputStream("yourfile.dbf");
DbaseFileReader dbfReader =
        new DbaseFileReader(fis.getChannel(), false, Charset.forName("ISO-8859-1"));
while (dbfReader.hasNext()) {
    final Object[] fields = dbfReader.readEntry();
    String field1 = (String) fields[0];
    Integer field2 = (Integer) fields[1];
    System.out.println("DBF field 1 value is: " + field1);
    System.out.println("DBF field 2 value is: " + field2);
}
dbfReader.close();
fis.close();