17.2. Kafka Data Store Parameters
The Kafka data store differs from most data stores in that the data set is kept entirely in memory. Because of this, the in-memory indexing can be configured at runtime through data store parameters. See Kafka Index Configuration for more information on the available indexing options.
Because configuration options can reference attributes from a particular SimpleFeatureType, it may be necessary to create multiple Kafka data store instances when dealing with multiple schemas.
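As a rough sketch of the multi-instance setup described above, the snippet below builds one parameter map per schema from a set of shared connection settings. Only `kafka.brokers` comes from this page; the class `PerSchemaParams`, the helper `paramsFor`, the attribute names, and the key `kafka.index.cqengine` with its `attribute:type` value syntax are assumptions made for illustration and should be checked against Kafka Index Configuration.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class PerSchemaParams {

    /** Copies the shared settings and layers the schema-specific options on top. */
    public static Map<String, Serializable> paramsFor(
            Map<String, ? extends Serializable> shared,
            Map<String, ? extends Serializable> schemaSpecific) {
        Map<String, Serializable> params = new HashMap<>(shared);
        params.putAll(schemaSpecific);
        return params;
    }

    public static void main(String[] args) {
        Map<String, Serializable> shared = new HashMap<>();
        shared.put("kafka.brokers", "localhost:9092");

        // ASSUMPTION: the key "kafka.index.cqengine" and the "attribute:type"
        // value syntax are illustrative only; confirm both before use.
        Map<String, Serializable> vessels =
            paramsFor(shared, Map.of("kafka.index.cqengine", "name:default"));
        Map<String, Serializable> flights =
            paramsFor(shared, Map.of("kafka.index.cqengine", "callsign:default"));

        // Each map would then be passed to its own DataStoreFinder.getDataStore(...) call,
        // giving one data store instance per schema.
        System.out.println(vessels);
        System.out.println(flights);
    }
}
```

Keeping the shared settings in one map and copying them avoids the two instances accidentally sharing attribute-specific options.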
Use the following parameters for a Kafka data store (required parameters are marked with *):
| Parameter | Type | Description |
|---|---|---|
| | String | Kafka brokers, e.g. |
| | String | Kafka zookeepers, e.g. |
| | String | The Kafka topic used to store schema metadata (when not using Zookeeper) |
| | String | Zookeeper discoverable path, can be used to namespace feature types (when using Zookeeper) |
| | String | Configuration options for the Kafka producer, in Java properties format. See Producer Configs |
| | Boolean | Send a ‘clear’ message on startup. This will cause clients to ignore any data that was in the topic prior to startup |
| | String | Configuration options for the Kafka consumer, in Java properties format. See Consumer Configs |
| | String | On startup, read messages that were written within this time frame (instead of ignoring old messages), e.g. |
| | Integer | Number of Kafka consumers used per feature type. Set to 0 to disable consuming (i.e. producer only) |
| | String | Prefix to use for the Kafka group ID, to more easily identify particular data stores |
| | Boolean | The default behavior is to start consuming a topic only when that feature type is first requested. This can reduce load if some layers are never queried. Note that care should be taken when setting this to false, as the store will immediately start consuming from Kafka for all known feature types, which may require significant memory overhead. |
| | Integer | Number of partitions to use in new Kafka topics |
| | Integer | Replication factor to use in new Kafka topics |
| | String | Internal serialization format to use for Kafka messages. Must be one of |
| | String | Expire features from the in-memory cache after this delay, e.g. “10 minutes”. See Feature Expiration |
| | String | Expire features dynamically based on CQL predicates. See Feature Expiration |
| | String | Instead of message time, determine expiry based on feature data. See Feature Event Time |
| | Boolean | Instead of message time, determine feature ordering based on the feature event time. See Feature Event Time |
| | String | Use CQEngine-based attribute indices for the in-memory feature cache. See CQEngine Indexing |
| | Integer | Number of bins in the x-dimension of the spatial index, by default 360. See Spatial Index Resolution |
| | Integer | Number of bins in the y-dimension of the spatial index, by default 180. See Spatial Index Resolution |
| | String | Number and size of tiers used for indexing geometries with extents, in the form |
| | Boolean | Use lazy deserialization of features. This may improve processing load at the expense of slightly slower query times |
| | String | Additional views on existing schemas to expose as layers. See Layer Views for details |
| | String | Reporters used to publish Kafka metrics, as TypeSafe config. To use multiple reporters, nest them under the key |
| | Boolean | Use loose bounding boxes, which offer improved performance but are not exact |
| | Boolean | Audit incoming queries. By default audits are written to a log file |
| | String | Default authorizations used to query data, comma-separated |
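The producer and consumer configuration parameters above take their value as a single string in Java properties format. The sketch below shows one way such a string might be assembled and checked using only the standard library. The class `KafkaConfigParam` and its helpers are hypothetical names introduced here; `acks` and `linger.ms` are standard Kafka producer settings chosen purely as examples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

class KafkaConfigParam {

    /**
     * Joins options into properties text, one "key=value" per line.
     * Simplification: keys and values are not escaped, so this is only
     * suitable for plain values without '=', ':', '\' or line breaks.
     */
    public static String toPropertiesText(Map<String, String> options) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : options.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Parses the text with java.util.Properties to confirm it is well-formed. */
    public static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> producer = new LinkedHashMap<>();
        producer.put("acks", "all");
        producer.put("linger.ms", "5");

        String text = toPropertiesText(producer);
        System.out.print(text);

        // Round-trip check before handing the string to the data store
        Properties parsed = parse(text);
        System.out.println(parsed.getProperty("acks"));
    }
}
```

The resulting string would be supplied as the value of the producer (or consumer) configuration parameter when building the data store parameter map.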
17.2.1. Programmatic Access
An instance of a Kafka data store can be obtained through the normal GeoTools discovery methods, assuming that the GeoMesa code is on the classpath.
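    // Requires java.util.Map, java.util.HashMap and java.io.Serializable
    Map<String, Serializable> parameters = new HashMap<>();
    parameters.put("kafka.brokers", "localhost:9092");
    // getDataStore returns null if no data store on the classpath accepts the parameters
    org.geotools.api.data.DataStore dataStore =
        org.geotools.api.data.DataStoreFinder.getDataStore(parameters);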
More information on using GeoTools can be found in the GeoTools user guide.