5. GeoTools Overview
The main abstraction in GeoMesa is the GeoTools DataStore. Understanding the GeoTools API is important
for integrating with GeoMesa. The GeoTools project maintains full documentation of the API;
this section gives a concise overview of the main ways to interact with a data store.
Note
This section is focused on users who want to integrate with GeoMesa through code. Many use cases do not require this; data can be ingested using the GeoMesa command-line tools or Apache NiFi processors, and accessed through GeoServer OGC requests or Spark. Even so, this page can provide useful background on the concepts behind those operations.
A data store provides read and write access to spatial data. The API itself does not distinguish between different storage formats. Thus, the API for accessing data stored in a local shapefile is the same as the API for accessing data stored in an HBase cluster. GeoMesa provides several data store implementations, including HBase, Accumulo, Kafka, and others. See GeoMesa Data Stores for more information on the different data stores available.
5.1. SimpleFeatureType and SimpleFeature
In GeoTools, a SimpleFeatureType defines the names and types of the attributes in a given schema. It is similar
to the table definition of a relational database. A SimpleFeatureType can be described with a type name and a
specification, typically a string listing the attribute names and types. A SimpleFeature is a struct data
type, equivalent to a single row in a relational database table. Each SimpleFeature is associated with a
SimpleFeatureType, and has a unique identifier (the feature ID) and a list of values corresponding to the
attributes in the SimpleFeatureType. See below for examples of creating and managing SimpleFeatureTypes.
The “simple” in SimpleFeatureType refers to its flat data structure. It is also possible to have “complex”
feature types, which are similar to joins in a relational database. However, complex feature types are not widely
used or supported.
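To make the idea of a specification string concrete, the sketch below splits the spec used later in this section into attribute names and type names. It is a plain-Java illustration only: SpecSketch and parse are hypothetical names, not part of GeoTools or GeoMesa, and the real parser handles more syntax than this (for example, the sketch ignores options such as srid=4326).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: this is NOT the GeoMesa spec parser, which supports more syntax.
class SpecSketch {

    /** Maps each attribute name to its type name, preserving declaration order. */
    static Map<String, String> parse(String spec) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (String attribute : spec.split(",")) {
            // each attribute is name:Type, optionally followed by :key=value options
            String[] parts = attribute.trim().split(":");
            attributes.put(parts[0], parts[1]);
        }
        return attributes;
    }

    public static void main(String[] args) {
        String spec = "item:String,amount:Double,date:Date,location:Point:srid=4326";
        parse(spec).forEach((name, type) -> System.out.println(name + " -> " + type));
    }
}
```

The ordered map mirrors the way a SimpleFeature holds a list of values in the same order as the attributes of its SimpleFeatureType.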
5.2. Getting a Data Store Instance
Data stores are accessed through org.geotools.data.DataStoreFinder#getDataStore. The function takes a parameter
map, which is used to dynamically load a data store. For example, to load a GeoMesa HBase data store, include the
parameter key "hbase.catalog". Because loading is dynamic, the appropriate data store implementation
and all of its required dependencies must be on the classpath. If no implementation on the classpath accepts
the parameters, getDataStore returns null.
GeoMesa data stores are thread-safe (although not all methods on the data store return thread-safe objects).
Generally, a data store should be loaded once and then used repeatedly. When no longer needed, a data store
should be cleaned up by calling the dispose() method.
See the links in GeoMesa Data Stores for an explanation of the parameters for each data store implementation.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.geotools.data.DataStore;
import org.geotools.data.DataStoreFinder;
import org.locationtech.geomesa.hbase.data.HBaseDataStoreParams;
Map<String, String> parameters = new HashMap<>();
// HBaseDataStoreParams.HBaseCatalogParam().key is the string "hbase.catalog"
// the GeoMesa HBase data store will recognize the key and attempt to load itself
parameters.put(HBaseDataStoreParams.HBaseCatalogParam().key, "mycatalog");
DataStore store = null;
try {
    // returns null if no data store on the classpath accepts the parameters
    store = DataStoreFinder.getDataStore(parameters);
} catch (IOException e) {
    e.printStackTrace();
}
// when finished, be sure to clean up the store
if (store != null) {
    store.dispose();
}
import org.geotools.data.DataStoreFinder
import org.locationtech.geomesa.hbase.data.HBaseDataStoreParams
import scala.collection.JavaConverters._
// HBaseDataStoreParams.HBaseCatalogParam.key is the string "hbase.catalog"
// the GeoMesa HBase data store will recognize the key and attempt to load itself
val params = Map(HBaseDataStoreParams.HBaseCatalogParam.key -> "mycatalog")
val store = DataStoreFinder.getDataStore(params.asJava)
// when finished, be sure to clean up the store
store.dispose()
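The lookup described above can be pictured with a small plain-Java model: each factory is asked whether it recognizes the parameter map, and the first one that does creates the store. This is a sketch under stated assumptions, not GeoTools code. ToyFinder and ToyFactory are hypothetical stand-ins, the key "other.catalog" is invented for illustration, and the fixed list stands in for the factories that GeoTools would discover on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Toy model only: ToyFactory and ToyFinder are hypothetical stand-ins, not GeoTools classes.
class ToyFinder {

    interface ToyFactory {
        boolean canProcess(Map<String, ?> params); // does this factory recognize the parameters?
        String create(Map<String, ?> params);      // a real factory would return a data store
    }

    /** Builds a factory that recognizes exactly one parameter key. */
    static ToyFactory keyed(String key, String label) {
        return new ToyFactory() {
            public boolean canProcess(Map<String, ?> params) { return params.containsKey(key); }
            public String create(Map<String, ?> params) { return label + ":" + params.get(key); }
        };
    }

    // stands in for the factories discovered on the classpath ("other.catalog" is made up)
    static final List<ToyFactory> FACTORIES = Arrays.asList(
        keyed("hbase.catalog", "hbase"),
        keyed("other.catalog", "other"));

    /** Returns the first store whose factory accepts the parameters, or null if none does. */
    static String getDataStore(Map<String, ?> params) {
        for (ToyFactory factory : FACTORIES) {
            if (factory.canProcess(params)) {
                return factory.create(params);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(getDataStore(Collections.singletonMap("hbase.catalog", "mycatalog")));
        System.out.println(getDataStore(Collections.singletonMap("unknown.key", "x")));
    }
}
```

The null result for an unrecognized key is the reason the Java example above checks the store before calling dispose().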
5.3. Creating a Schema
Each data store can contain multiple SimpleFeatureTypes, or schemas. The names of existing schemas can be
listed with the getTypeNames method, and an individual schema can be retrieved with the getSchema method.
Schemas can be created, updated and deleted through the createSchema, updateSchema and removeSchema
methods, respectively.
See GeoTools Feature Types for a list of the attribute type bindings available.
import java.io.IOException;
import org.locationtech.geomesa.utils.interop.SimpleFeatureTypes;
import org.opengis.feature.simple.SimpleFeatureType;
try {
    // check whether the schema already exists before creating it
    String[] types = store.getTypeNames();
    boolean exists = false;
    for (String type: types) {
        if (type.equals("purchases")) {
            exists = true;
            break;
        }
    }
    if (!exists) {
        SimpleFeatureType myType =
            SimpleFeatureTypes.createType(
                "purchases", "item:String,amount:Double,date:Date,location:Point:srid=4326");
        store.createSchema(myType);
    }
} catch (IOException e) {
    e.printStackTrace();
}
import org.locationtech.geomesa.utils.geotools.SimpleFeatureTypes
if (!store.getTypeNames.contains("purchases")) {
  val myType =
    SimpleFeatureTypes.createType(
      "purchases", "item:String,amount:Double,date:Date,location:Point:srid=4326")
  store.createSchema(myType)
}
5.4. Writing Data
Data stores support writing data on a row-by-row basis. There are two write paths: appending writes and modifying writes.
Warning
Pay close attention to the use of PROVIDED_FID in the following sections. This hint controls how
each feature ID is assigned.
Some data stores support transactions, which can be used to isolate a group of operations. GeoMesa does not
support transactions, so the default GeoTools Transaction.AUTO_COMMIT is used in the examples. Generally,
once a writer is successfully closed, the data has been persisted to the underlying store. Until then,
data may be cached and buffered locally, and may not be persisted or available to query.
5.4.1. Appending Writes
An appending writer can be obtained through the getFeatureWriterAppend method. A feature writer is similar to
an iterator; next is called to obtain a new feature, the feature is updated with the values to be written,
and then write is called to persist it. Once all writes are complete, the feature writer should be closed.
The ID used to uniquely identify a feature is called the feature ID, or FID. By default, GeoTools will
generate a new feature ID for each feature. To specify a feature ID, set the PROVIDED_FID hint in the feature
user data, as shown below.
Warning
It is a logical error to write the same feature ID more than once with an appending feature writer. This may result in inconsistencies in the persisted data. Refer to the next section for how to safely update existing features.
import java.io.IOException;
import org.geotools.data.FeatureWriter;
import org.geotools.data.Transaction;
import org.geotools.util.factory.Hints;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;
// use try-with-resources to close the writer when done
try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
         store.getFeatureWriterAppend("purchases", Transaction.AUTO_COMMIT)) {
    // repeat as needed, once per feature
    // note: hasNext() will always return false, but can be ignored
    SimpleFeature next = writer.next();
    // use "id-01" as the feature ID instead of a generated one
    next.getUserData().put(Hints.PROVIDED_FID, "id-01");
    next.setAttribute("item", "swag");
    next.setAttribute("amount", 20.0);
    // attributes will be converted to the appropriate type if needed
    next.setAttribute("date", "2020-01-01T00:00:00.000Z");
    next.setAttribute("location", "POINT (-82.379 34.1782)");
    writer.write();
} catch (IOException e) {
    e.printStackTrace();
}
import org.geotools.data.Transaction
import org.geotools.util.factory.Hints
val writer = store.getFeatureWriterAppend("purchases", Transaction.AUTO_COMMIT)
try {
  // repeat as needed, once per feature
  // note: hasNext will always return false, but can be ignored
  val next = writer.next()
  // use "id-01" as the feature ID instead of a generated one
  next.getUserData.put(Hints.PROVIDED_FID, "id-01")
  next.setAttribute("item", "swag")
  next.setAttribute("amount", 20.0)
  // attributes will be converted to the appropriate type if needed
  next.setAttribute("date", "2020-01-01T00:00:00.000Z")
  next.setAttribute("location", "POINT (-82.379 34.1782)")
  writer.write()
} finally {
  writer.close()
}
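The next / write / close protocol, and the reason data may not be visible until the writer is closed, can be modelled in a few lines of plain Java. The sketch below is an assumption-laden toy: ToyAppendWriter is a hypothetical class, not a GeoTools FeatureWriter, and it always holds rows in a local buffer until close, whereas a real writer may flush at other times as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: ToyAppendWriter is a hypothetical stand-in, not a GeoTools FeatureWriter.
class ToyAppendWriter implements AutoCloseable {

    private final List<Map<String, Object>> store;                      // the "persisted" data
    private final List<Map<String, Object>> buffer = new ArrayList<>(); // local, unflushed writes
    private Map<String, Object> current;

    ToyAppendWriter(List<Map<String, Object>> store) {
        this.store = store;
    }

    /** Hands out a new, empty row for the caller to populate. */
    Map<String, Object> next() {
        current = new LinkedHashMap<>();
        return current;
    }

    /** Buffers the current row; nothing reaches the store yet. */
    void write() {
        buffer.add(current);
        current = null;
    }

    /** Flushes all buffered rows to the store. */
    @Override
    public void close() {
        store.addAll(buffer);
        buffer.clear();
    }

    public static void main(String[] args) {
        List<Map<String, Object>> store = new ArrayList<>();
        try (ToyAppendWriter writer = new ToyAppendWriter(store)) {
            Map<String, Object> next = writer.next();
            next.put("item", "swag");
            next.put("amount", 20.0);
            writer.write();
            System.out.println("rows visible before close: " + store.size());
        }
        System.out.println("rows visible after close: " + store.size());
    }
}
```

Because the toy simply appends, writing the same ID twice would leave two rows in the store, which loosely illustrates why reusing a feature ID with an appending writer is a logical error.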
An alternative way to make appending writes is to use a FeatureStore. GeoTools defines a FeatureSource as
read-only. FeatureStore extends FeatureSource and provides write functionality, but must be checked with
a runtime cast.
import java.io.IOException;
import org.geotools.data.simple.SimpleFeatureSource;
import org.geotools.data.simple.SimpleFeatureStore;
import org.geotools.feature.DefaultFeatureCollection;
try {
    SimpleFeatureSource source = store.getFeatureSource("purchases");
    // write support is only available if the source is also a feature store
    if (source instanceof SimpleFeatureStore) {
        // declared as DefaultFeatureCollection so that add() is available
        DefaultFeatureCollection collection = new DefaultFeatureCollection();
        // omitted - add features to the collection
        ((SimpleFeatureStore) source).addFeatures(collection);
    } else {
        throw new IllegalStateException("Store is read only");
    }
} catch (IOException e) {
    e.printStackTrace();
}
import org.geotools.data.simple.SimpleFeatureStore
import org.geotools.feature.DefaultFeatureCollection
store.getFeatureSource("purchases") match {
  case s: SimpleFeatureStore =>
    val collection = new DefaultFeatureCollection()
    // ??? is a placeholder - add real features to the collection here
    collection.add(???)
    s.addFeatures(collection)
  case _ => throw new IllegalStateException("Store is read only")
}
5.4.2. Modifying Writes
To update an existing feature, obtain a modifying writer through the getFeatureWriter method,
which requires a filter specifying the features to be updated. A modifying feature writer is similar to an
appending feature writer, except that the method hasNext will return true as long as there are additional
features to modify. The features returned from next will be pre-populated with the current data for each feature.
Filters can be created through the GeoTools method ECQL.toFilter. See the GeoTools documentation for more
information on CQL filters.
import java.io.IOException;
import org.geotools.data.FeatureWriter;
import org.geotools.data.Transaction;
import org.geotools.filter.text.cql2.CQLException;
import org.geotools.filter.text.ecql.ECQL;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;
// the filter selects the feature with ID "id-01"
try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
         store.getFeatureWriter("purchases", ECQL.toFilter("IN ('id-01')"), Transaction.AUTO_COMMIT)) {
    while (writer.hasNext()) {
        SimpleFeature next = writer.next();
        next.setAttribute("amount", 21.0);
        writer.write(); // or, to delete it: writer.remove();
    }
} catch (IOException | CQLException e) {
    e.printStackTrace();
}
import org.geotools.data.Transaction
import org.geotools.filter.text.ecql.ECQL
val filter = ECQL.toFilter("IN ('id-01')")
val writer = store.getFeatureWriter("purchases", filter, Transaction.AUTO_COMMIT)
try {
  while (writer.hasNext) {
    val next = writer.next
    next.setAttribute("amount", 21.0)
    writer.write() // or, to delete it: writer.remove()
  }
} finally {
  writer.close()
}
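For readers more familiar with the Java collections API, the modifying-writer loop is loosely comparable to a java.util.ListIterator: hasNext reports whether elements remain, next returns the current element, set replaces it and remove deletes it. The sketch below shows only that analogy. ToyModify and updateAmount are hypothetical names, the rows are plain maps instead of features, and the ID check inside the loop stands in for the filter, which a real data store applies before returning any features.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;

// Toy model only: plain collections standing in for a data store, not GeoTools code.
class ToyModify {

    /** Sets "amount" on every row whose "id" matches; returns how many rows were updated. */
    static int updateAmount(List<Map<String, Object>> rows, String id, double amount) {
        int updated = 0;
        ListIterator<Map<String, Object>> iter = rows.listIterator();
        while (iter.hasNext()) {                    // true while rows remain
            Map<String, Object> next = iter.next();
            if (id.equals(next.get("id"))) {        // stands in for the filter
                // start from the current values, then change one attribute
                Map<String, Object> copy = new LinkedHashMap<>(next);
                copy.put("amount", amount);
                iter.set(copy);                     // like write(); iter.remove() would delete
                updated++;
            }
        }
        return updated;
    }

    static Map<String, Object> row(String id, double amount) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("amount", amount);
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("id-01", 20.0));
        rows.add(row("id-02", 5.0));
        System.out.println("updated " + updateAmount(rows, "id-01", 21.0) + " row(s): " + rows);
    }
}
```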
5.5. Reading Data
Once data has been persisted, it can be read back through the getFeatureReader method. GeoTools returns a “live”
iterator of results that may point to a remote location. Generally data is not actually read from the backing store
until it is required, so it is possible to read a few records without fetching the entire result set.
To filter the results that come back, predicates can be created using the “common query language”, CQL. Filters can
be created through the GeoTools method ECQL.toFilter. See the GeoTools documentation for more information
on CQL filters.
import java.io.IOException;
import org.geotools.data.DataUtilities;
import org.geotools.data.FeatureReader;
import org.geotools.data.Query;
import org.geotools.data.Transaction;
import org.geotools.filter.text.cql2.CQLException;
import org.geotools.filter.text.ecql.ECQL;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;
try {
    // select features whose location falls inside the bounding box
    Query query = new Query("purchases", ECQL.toFilter("bbox(location,-85,30,-80,35)"));
    // use try-with-resources to close the reader when done
    try (FeatureReader<SimpleFeatureType, SimpleFeature> reader =
             store.getFeatureReader(query, Transaction.AUTO_COMMIT)) {
        while (reader.hasNext()) {
            SimpleFeature next = reader.next();
            System.out.println(DataUtilities.encodeFeature(next));
        }
    }
} catch (IOException | CQLException e) {
    e.printStackTrace();
}
import org.geotools.data.{DataUtilities, Query, Transaction}
import org.geotools.filter.text.ecql.ECQL
val query = new Query("purchases", ECQL.toFilter("bbox(location,-85,30,-80,35)"))
val reader = store.getFeatureReader(query, Transaction.AUTO_COMMIT)
try {
  while (reader.hasNext) {
    val next = reader.next
    println(DataUtilities.encodeFeature(next))
  }
} finally {
  reader.close()
}
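The “live” iterator behaviour described in this section, where records are fetched only when they are asked for, can be illustrated with a toy reader in plain Java. LazyReader below is a hypothetical class, not a GeoTools FeatureReader. It fabricates its records on demand and counts how many were produced, as a stand-in for round trips to a backing store; real readers may fetch in batches, so the one-record-per-call behaviour is an assumption of the sketch.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Toy model only: LazyReader is a hypothetical stand-in, not a GeoTools FeatureReader.
class LazyReader implements Iterator<String>, AutoCloseable {

    private final int total;        // size of the full result set
    private int fetched = 0;        // records pulled from the "backing store" so far
    private boolean closed = false;

    LazyReader(int total) {
        this.total = total;
    }

    @Override
    public boolean hasNext() {
        return !closed && fetched < total;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        fetched++;                  // the fetch happens here, on demand
        return "record-" + fetched;
    }

    int fetchedCount() {
        return fetched;
    }

    /** Releases the reader; no further records are returned afterwards. */
    @Override
    public void close() {
        closed = true;
    }

    public static void main(String[] args) {
        try (LazyReader reader = new LazyReader(1000)) {
            reader.next();
            reader.next();
            System.out.println("fetched " + reader.fetchedCount() + " of 1000 records");
        }
    }
}
```

Closing the reader early, as the try-with-resources block does here, is what lets a caller read a few records and stop without paying for the rest of the result set.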