8.13. Age-Off Iterators¶
This chapter provides documentation on how to configure the beta feature for data age-off via Accumulo iterators. Age-off allows administrators to set retention periods on data (e.g. 3 months) in order to automatically hide and delete data from tables without manually deleting features.
Currently there are two types of age-off supported - attribute based, and ingest-time based.
8.13.1. Installation¶
The age-off iterators are provided as part of the GeoMesa Accumulo Distributed Runtime jar which can be found in the Installing GeoMesa Accumulo chapter.
8.13.2. Requirements¶
Age-off iterators are applied to individual Accumulo tables - as such, to configure age-off on a simple feature
type, table sharing must be disabled. This may be accomplished by setting the user data string
geomesa.table.sharing='false'
on the simple feature type before calling createSchema
. This check may be
overridden by setting the system property geomesa.age-off.override=true
. This may be desirable for configuring
ingest time age-off on multiple simple feature types, or when it is known that only a single simple feature type
is present in a catalog. Note that configuring attribute-based age-off on multiple features in a shared catalog
will generally not work.
8.13.3. Configuration¶
Age-off can be configured through the GeoMesa command-line tools, using the configure-age-off
command.
See Accumulo Command-Line Tools for an overview of the command-line tools.
Warning
If age-off iterators have previously been configured manually, it is suggested to remove them before using any provided tools, in order to ensure consistency in naming and priority.
8.13.3.1. Viewing Current Age-Off¶
Any configured age-off iterators can be shown via the command line tools configure-age-off
command:
$ geomesa-accumulo configure-age-off -c test_catalog -f test_feature --list
INFO Attribute age-off: None
INFO Timestamp age-off: name:age-off, priority:10, class:org.locationtech.geomesa.accumulo.iterators.AgeOffIterator, properties:{retention=PT1M}
Note that this may not display any manually configured age-off iterators, as the iterator name may not match.
8.13.3.2. Ingest Time Age-Off¶
Ingest time age-off will use the timestamp associated with the Accumulo data value to validate retention.
Note
Ingest time age-off requires that the Accumulo tables are configured with system timestamps, instead of the default logical timestamps. See Accumulo Logical Timestamps for more information.
Ingest time age-off can be configured via the command line tools configure-age-off
command, without specifying
a date attribute:
$ geomesa-accumulo configure-age-off -c test_catalog -f test_feature --set --expiry '1 day'
This will remove any existing age-off configuration and replace it with the new specification.
8.13.3.3. Attribute Age-Off¶
Attribute age-off will use a date-type attribute to validate retention.
Attribute age-off can be configured via the command line tools configure-age-off
command by specifying
a date attribute:
$ geomesa-accumulo configure-age-off -c test_catalog -f test_feature --set --expiry '1 day' --dtg my_date_attribute
This will remove any existing age-off configuration and replace it with the new specification.
8.13.3.4. Removing Age-Off¶
Any configured age-off iterators can be cleared via the command line tools configure-age-off
command:
$ geomesa-accumulo configure-age-off -c test_catalog -f test_feature --remove
This will remove both attribute and ingest time age-off.
8.13.4. Statistics¶
As features are aged off, summary data statistics will get out of date, which can degrade query planning. For manageable data sets, it is recommended to re-analyze statistics every so often, via the stats-analyze command. If the data set is too large for this to be feasible, then stats can instead be disabled completely via geomesa.stats.generate.
8.13.5. Forcing Deletion of Records¶
The GeoMesa age-off iterators will not fully delete records until compactions occur. To force a true deletion of data on disk, you must manually compact a table or range. When compacting an entire table you should take care not to overwhelm your system. You may start a compaction through the Accumulo shell:
# compact a single table
compact -t geomesa.mycatalog_mytype_z2
# compact all tables for a catalog
compact -p geomesa.mycatalog.*