15.5. Accumulo Configuration

This section details Accumulo-specific configuration properties. For general properties, see Runtime Configuration.

15.5.1. General Properties

15.5.1.1. geomesa.accumulo.table.cache.expiry

How long to cache the existence of tables, defined as a duration, e.g. 60 seconds or 100 millis. To avoid checking whether tables exist before every write, table checks are cached. If tables are deleted while ingest is still running, they will not be re-created until the cache expires.

Default is 10 minutes.
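
The caching behaviour described above can be illustrated with a minimal sketch. This is not GeoMesa's implementation; the class name TableExistenceCache, the injectable clock and the exists predicate are all hypothetical, chosen only to show why a deleted table is not noticed until the cached entry expires.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical illustration only; not GeoMesa's actual implementation.
final class TableExistenceCache {
    private final long expiryNanos;
    private final LongSupplier clock;        // injectable clock, in nanoseconds
    private final Predicate<String> exists;  // the real (expensive) existence check
    private final ConcurrentHashMap<String, Long> checkedAt = new ConcurrentHashMap<>();

    TableExistenceCache(Duration expiry, LongSupplier clock, Predicate<String> exists) {
        this.expiryNanos = expiry.toNanos();
        this.clock = clock;
        this.exists = exists;
    }

    // Returns true if the table was confirmed within the expiry window, or exists now.
    boolean tableExists(String table) {
        long now = clock.getAsLong();
        Long last = checkedAt.get(table);
        if (last != null && now - last < expiryNanos) {
            return true; // cached answer; stale if the table was deleted in the meantime
        }
        boolean present = exists.test(table);
        if (present) {
            checkedAt.put(table, now);   // remember the successful check
        } else {
            checkedAt.remove(table);     // caller would re-create the table here
        }
        return present;
    }
}
```

With a 10 minute expiry, a table deleted 5 minutes after the last check is still reported as existing, so a writer would not re-create it until the entry expires. A shorter expiry narrows that window at the cost of more frequent checks.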

15.5.1.2. geomesa.accumulo.table.sync

Sets the level of synchronization when creating and deleting tables. When using tables backed by S3, synchronization may prevent table corruption errors in Accumulo. Possible values are:

  • zookeeper (default) - uses a distributed lock that works across JVMs.

  • local - uses an in-memory lock that works within a single JVM.

  • none - does not use any external locking. Generally this is safe when using tables backed by HDFS.

The synchronization level may be adjusted depending on the architecture being used. For example, if tables are only ever created by a single thread, then a system may safely disable synchronization.
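
As a rough sketch, the level could be set as a JVM system property before the data store is created. This assumes the property is read from system properties, as described in Runtime Configuration; the helper class TableSyncConfig is hypothetical and only validates the value against the three levels listed above.

```java
import java.util.Set;

// Hypothetical helper; only the property name and the three levels come from this section.
final class TableSyncConfig {
    static final String KEY = "geomesa.accumulo.table.sync";
    static final Set<String> LEVELS = Set.of("zookeeper", "local", "none");

    // Validates the level and sets it as a JVM system property.
    static void setLevel(String level) {
        if (!LEVELS.contains(level)) {
            throw new IllegalArgumentException("Unknown sync level: " + level);
        }
        System.setProperty(KEY, level);
    }
}
```

For example, a single-JVM ingest process might call TableSyncConfig.setLevel("local") at startup. The same value could equally be passed on the command line as -Dgeomesa.accumulo.table.sync=local.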

15.5.2. Batch Writer Properties

The following properties control the configuration of Accumulo BatchWriters. They map directly to the underlying BatchWriter methods.

15.5.2.1. geomesa.batchwriter.latency

The latency is defined as a duration, e.g. 60 seconds or 100 millis. See the Accumulo API for details.

15.5.2.2. geomesa.batchwriter.maxthreads

Determines the max threads used for writing. See the Accumulo API for details.

15.5.2.3. geomesa.batchwriter.memory

The memory is defined in bytes, e.g. 10mb or 100kb. See the Accumulo API for details.

15.5.2.4. geomesa.batchwriter.timeout.millis

The timeout is defined as a duration, e.g. 60 seconds or 100 millis. See the Accumulo API for details.
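
The four batch writer properties might be set together as in the sketch below. The values reuse the example formats given above (the thread count of 10 is arbitrary), and the sketch assumes the properties are picked up from JVM system properties. The class BatchWriterProperties is hypothetical. In the Accumulo API, the corresponding settings appear to be BatchWriterConfig.setMaxLatency, setMaxWriteThreads, setMaxMemory and setTimeout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; values are illustrative, not recommended defaults.
final class BatchWriterProperties {
    // Builds an example set of batch writer properties.
    static Map<String, String> example() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("geomesa.batchwriter.latency", "60 seconds");        // duration
        props.put("geomesa.batchwriter.maxthreads", "10");             // thread count
        props.put("geomesa.batchwriter.memory", "10mb");               // size in bytes
        props.put("geomesa.batchwriter.timeout.millis", "60 seconds"); // duration
        return props;
    }

    // Copies each entry into the JVM system properties.
    static void apply(Map<String, String> props) {
        props.forEach(System::setProperty);
    }
}
```

Calling BatchWriterProperties.apply(BatchWriterProperties.example()) before any writers are created would make the values visible to code that reads system properties.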

15.5.3. Remote Processing Properties

The following properties control whether certain queries are processed on the Accumulo tablet servers (push-down processing) or in the client. Enabling push-down processing can result in faster queries, but it also puts additional load on the Accumulo cluster, which may negatively impact concurrent clients.

See also the Accumulo data store parameters (accumulo_parameters) for configuring these properties directly in the data store parameters.

15.5.3.1. geomesa.accumulo.remote.arrow.enable

Enable processing Arrow encoding in Accumulo tablet servers as a distributed call, instead of encoding locally in the client. Default is true.

15.5.3.2. geomesa.accumulo.remote.bin.enable

Enable processing binary encoding in Accumulo tablet servers as a distributed call, instead of encoding locally in the client. Default is true.

15.5.3.3. geomesa.accumulo.remote.density.enable

Enable processing heatmap encoding in Accumulo tablet servers as a distributed call, instead of encoding locally in the client. Default is true.

15.5.3.4. geomesa.accumulo.remote.stats.enable

Enable processing statistical calculations in Accumulo tablet servers as a distributed call, instead of calculating locally in the client. Default is true.

15.5.4. Map Reduce Input Splits Properties

The following properties control the number of input splits for a map reduce job. See the Accumulo User Manual for details.

15.5.4.1. geomesa.mapreduce.splits.max

Set the absolute number of splits when configuring a mapper instead of allowing Accumulo to create a split for each range or basing it on the number of tablet servers.

Setting this value overrides geomesa.mapreduce.splits.tserver.max.

15.5.4.2. geomesa.mapreduce.splits.tserver.max

Set the max number of splits per tablet server when configuring a mapper. By default this value is calculated using Accumulo’s AbstractInputFormat.getSplits method which creates a split for each range. In some scenarios this may create an undesirably large number of splits.

This value is overridden by geomesa.mapreduce.splits.max if it is set.
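
The precedence between the two split properties can be sketched as follows. This is an illustration of the override rule only, not GeoMesa's code: the class SplitResolver is hypothetical, and capping the per-tablet-server case at the number of ranges is an assumption made for the sketch.

```java
import java.util.OptionalInt;

// Hypothetical illustration of the precedence rule; not GeoMesa's actual logic.
final class SplitResolver {
    static int resolve(OptionalInt splitsMax, OptionalInt tserverMax,
                       int rangeCount, int tabletServers) {
        if (splitsMax.isPresent()) {
            // geomesa.mapreduce.splits.max is absolute and takes priority
            return splitsMax.getAsInt();
        }
        if (tserverMax.isPresent()) {
            // Assumption: per-server limit times server count, never more than one per range
            return Math.min(rangeCount, tserverMax.getAsInt() * tabletServers);
        }
        // Default described above: one split per range
        return rangeCount;
    }
}
```

Under these assumptions, a job with 1000 ranges on 10 tablet servers would get 1000 splits by default, 20 splits with a per-server maximum of 2, and exactly 8 splits if the absolute maximum is set to 8, regardless of the per-server value.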

15.5.5. Zookeeper Session Timeout

15.5.5.1. instance.zookeeper.timeout

The Zookeeper timeout is defined in milliseconds, according to the Accumulo specification. See the Accumulo User Manual for details.