19.7. Shapefile Converter
The Shapefile converter handles shapefiles. To use the Shapefile converter, specify type = "shp" in your converter definition.
19.7.1. Configuration
The Shapefile converter does not have any configuration beyond the basic converter defaults. However, since a shapefile is a collection of files, and not a single input, there are a few constraints that must be observed.
Shapefile converters cannot be used in distributed converter jobs, as the map/reduce paradigm does not work
well with the collection of related files that make up a shapefile. In addition, when using a Shapefile converter
it is important to set the input file path in the evaluation context. The GeoMesa CLI tools handle this
automatically, but when the converter is used programmatically, "inputFilePath" must be set in the global
parameters of the evaluation context.
19.7.2. Shapefile Transform Functions
The transform element supports referencing each attribute in the input shapefile by its column number using $. $0 refers to the feature ID, the first attribute is $1, and so on. Each attribute is typed according to the schema of the shapefile.
Additionally, the attributes of the input shapefile may be referenced by name using the shp function. The feature ID can be referenced by using the shpFeatureId function.
The standard functions in Transformation Function Overview can be used as normal.
19.7.2.1. shp
For ease of use, this will access an attribute in the shapefile by name, e.g. shp('my_attribute'). Note that attributes can also be referenced by number, as described above.
19.7.2.2. shpFeatureId
For ease of use, this will access the feature ID of the current shapefile feature, e.g. shpFeatureId. Note that the feature ID can also be referenced by $0, as described above.
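As an illustration, the two ways of referencing the feature ID can be sketched in one converter definition. The converter name example_shp and the field name label are hypothetical, and the empty parentheses on shpFeatureId() are an assumption that zero-argument transform functions are invoked like functions that take arguments; adjust this if your GeoMesa version expects a different form.

```hocon
// Hypothetical converter, for illustration only
geomesa.converters.example_shp = {
  type = "shp"
  // feature ID via the function (call syntax with "()" is an assumption)
  id-field = "shpFeatureId()"
  // equivalent reference by column number:
  // id-field = "$0"
  fields = [
    // attribute looked up by name; "label" is a made-up output field
    { name = "label", transform = "shp('my_attribute')" }
  ]
}
```

Either form of id-field should yield the same feature IDs; the function form may be easier to read when a converter mixes many numbered references.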
19.7.3. Example Usage
This example is for the Tiger US States boundary file available from the US Census Bureau.
The shapefile has the following columns:
*the_geom:MultiPolygon,STATEFP:String,STATENS:String,AFFGEOID:String,GEOID:String,STUSPS:String,NAME:String,LSAD:String,ALAND:Long,AWATER:Long
From these, we will only consider a subset, using the following SimpleFeatureType:
geomesa.sfts.cb_2017_us_state_20m = {
  attributes = [
    { name = "name", type = "String" }
    { name = "usps", type = "String" }
    { name = "area", type = "Long" }
    { name = "geom", type = "MultiPolygon" }
  ]
}
You could ingest with the following converter:
geomesa.converters.cb_2017_us_state_20m = {
  type = "shp"
  id-field = "$0"
  fields = [
    { name = "name", transform = "$7" }, // example of lookup by field number
    { name = "usps", transform = "shp('STUSPS')" }, // example of lookup by name
    { name = "area", transform = "add(shp('ALAND'),shp('AWATER'))" },
    { name = "geom", transform = "shp('the_geom')" },
  ]
}
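For comparison, the same converter could be written using only column numbers. This is a sketch that assumes the columns are numbered in the order listed above, with the_geom as $1, which is consistent with NAME being $7 in the converter above.

```hocon
// Variant of the converter above, referencing every attribute by column number
// (assumes numbering follows the column order listed for the shapefile)
geomesa.converters.cb_2017_us_state_20m = {
  type = "shp"
  id-field = "$0"                                  // feature ID
  fields = [
    { name = "name", transform = "$7" },           // NAME
    { name = "usps", transform = "$6" },           // STUSPS
    { name = "area", transform = "add($9,$10)" },  // ALAND + AWATER
    { name = "geom", transform = "$1" },           // the_geom
  ]
}
```

Lookup by name is usually the more robust choice, since numbered references silently point at a different attribute if the column order of the input shapefile changes.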