20.8. Partition Schemes

Partition schemes define how data is laid out on the filesystem. The choice of scheme matters because it determines how much data a query has to read: when evaluating a query filter, the partition scheme is used to prune data files that cannot match the filter. Three main types of partition schemes are provided: spatial, temporal and attribute.

The partition scheme must be provided when creating a schema. The scheme is defined by a well-known name and a map of configuration options. See Configuring the Partition Scheme for details on how to specify a partition scheme.

20.8.1. Composite Schemes

Composite schemes are hierarchical combinations of other schemes. A composite scheme is named by concatenating the names of its constituent schemes, separated by commas, e.g. hourly,z2-2bits, which partitions data by hour and then by Z2 cell within each hour. The configuration options for each child scheme should be merged into a single configuration for the composite scheme.
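As a rough illustration of the naming and merging rules above, the following sketch joins child scheme names with commas and folds the per-child option maps into one map. The class and method names (CompositeSchemeSketch, compositeName, mergeOptions) are hypothetical helpers written for this example, not part of any library API, and the option value "DAYS" is an assumed spelling for the step-unit option.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CompositeSchemeSketch {

    // Builds a composite scheme name by joining the child names with commas.
    static String compositeName(List<String> childNames) {
        return String.join(",", childNames);
    }

    // Merges the option maps of the child schemes into a single map.
    // Assumption: if two children define the same key, the later map wins.
    @SafeVarargs
    static Map<String, String> mergeOptions(Map<String, String>... childOptions) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> options : childOptions) {
            merged.putAll(options);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical composite of the datetime and z2 schemes.
        String name = compositeName(List.of("datetime", "z2"));
        Map<String, String> options = mergeOptions(
            Map.of("datetime-format", "yyyy/MM/dd", "step-unit", "DAYS"),
            Map.of("z2-resolution", "2"));
        System.out.println(name);
        System.out.println(options);
    }
}
```

Because the child schemes in this section use distinct option names (datetime-format, z2-resolution, and so on), a plain map merge is usually enough. Key clashes would need an explicit policy.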

20.8.2. Temporal Schemes

Temporal schemes lay out data according to a Java date-time format string. The fields of the pattern are separated by forward slashes, so that each field becomes one level of the directory structure. All temporal schemes support the following common configuration option:

  • dtg-attribute - The name of a Date-type attribute from the SimpleFeatureType to use for partitioning data. If not specified, the default date attribute is used.

20.8.2.1. Date-Time Scheme

Name: datetime

Configuration:

  • datetime-format - A Java DateTime format string, separated by forward slashes, which will be used to build a directory structure. For example, yyyy/MM/dd.

  • step-unit - A java.time.temporal.ChronoUnit defining how to increment the leaf of the partition scheme.

  • step - The amount to increment the leaf of the partition scheme. If not specified, defaults to 1.

The date-time scheme provides a fully customizable temporal scheme.
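To make the options concrete, the sketch below formats a timestamp with a datetime-format pattern to obtain a partition path, and advances a timestamp by step units of step-unit to reach a later leaf. It uses only the standard java.time API. The class and method names are hypothetical, and treating step as a simple addition to the timestamp is one reading of the option descriptions above, not a statement about the actual implementation.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

public class DateTimePartitionSketch {

    // Formats a timestamp with the given pattern; each '/' in the pattern
    // becomes a directory separator in the resulting partition path.
    static String partitionPath(String datetimeFormat, ZonedDateTime dtg) {
        return DateTimeFormatter.ofPattern(datetimeFormat, Locale.ROOT).format(dtg);
    }

    // Moves a timestamp forward by `step` units of `stepUnit`.
    // Assumption: this models "incrementing the leaf" of the scheme.
    static ZonedDateTime nextLeaf(ZonedDateTime dtg, ChronoUnit stepUnit, long step) {
        return dtg.plus(step, stepUnit);
    }

    public static void main(String[] args) {
        ZonedDateTime dtg = ZonedDateTime.of(2024, 3, 5, 14, 30, 0, 0, ZoneOffset.UTC);
        System.out.println(partitionPath("yyyy/MM/dd", dtg)); // 2024/03/05
        ZonedDateTime next = nextLeaf(dtg, ChronoUnit.DAYS, 1);
        System.out.println(partitionPath("yyyy/MM/dd", next)); // 2024/03/06
    }
}
```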

20.8.2.2. Hourly Scheme

Name: hourly

The hourly scheme partitions data by the hour, using the layout yyyy/MM/dd/HH.

20.8.2.3. Minute Scheme

Name: minute

The minute scheme partitions data by the minute, using the layout yyyy/MM/dd/HH/mm.

20.8.2.4. Daily Scheme

Name: daily

The daily scheme partitions data by the day, using the layout yyyy/MM/dd.

20.8.2.5. Weekly Scheme

Name: weekly

The weekly scheme partitions data by the week, using the layout yyyy/ww.

20.8.2.6. Monthly Scheme

Name: monthly

The monthly scheme partitions data by the month, using the layout yyyy/MM.

20.8.2.7. Julian Schemes

Names: julian-minute, julian-hourly, julian-daily

Julian schemes partition data by Julian day (the day of the year, pattern DDD) instead of by month and day. They use the patterns yyyy/DDD/HH/mm, yyyy/DDD/HH, and yyyy/DDD, respectively.
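The named temporal schemes can be viewed as fixed format strings. The sketch below collects the layouts listed in this section into a lookup table and formats one timestamp with each. The class name, the table, and the partitionPath helper are hypothetical and exist only for illustration. Note that the week number produced by the ww pattern depends on the locale's week definition, so the weekly output may differ between environments.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class NamedTemporalLayoutsSketch {

    // Scheme name -> layout, copied from the descriptions in this section.
    static final Map<String, String> LAYOUTS = new LinkedHashMap<>();
    static {
        LAYOUTS.put("minute", "yyyy/MM/dd/HH/mm");
        LAYOUTS.put("hourly", "yyyy/MM/dd/HH");
        LAYOUTS.put("daily", "yyyy/MM/dd");
        LAYOUTS.put("weekly", "yyyy/ww");
        LAYOUTS.put("monthly", "yyyy/MM");
        LAYOUTS.put("julian-minute", "yyyy/DDD/HH/mm");
        LAYOUTS.put("julian-hourly", "yyyy/DDD/HH");
        LAYOUTS.put("julian-daily", "yyyy/DDD");
    }

    // Formats a timestamp using the layout registered for the scheme name.
    static String partitionPath(String schemeName, ZonedDateTime dtg) {
        String layout = LAYOUTS.get(schemeName);
        if (layout == null) {
            throw new IllegalArgumentException("Unknown scheme: " + schemeName);
        }
        return DateTimeFormatter.ofPattern(layout, Locale.ROOT).format(dtg);
    }

    public static void main(String[] args) {
        // 5 March 2024 is day 65 of a leap year.
        ZonedDateTime dtg = ZonedDateTime.of(2024, 3, 5, 14, 30, 0, 0, ZoneOffset.UTC);
        for (String name : LAYOUTS.keySet()) {
            System.out.println(name + " -> " + partitionPath(name, dtg));
        }
    }
}
```

For the timestamp above, daily yields 2024/03/05 while julian-daily yields 2024/065.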

20.8.2.8. Receipt Time Scheme

Name: receipt-time

Configuration:

  • datetime-scheme - The name of another date-time scheme describing the layout of the data, e.g. weekly or hourly. Additional options may be required to configure the date-time scheme selected.

  • buffer - The amount of time to buffer queries by, expressed as a duration, e.g. 30 minutes. This represents the latency in the system.

The receipt time scheme partitions data based on when a message is received. Generally this is useful only for reading existing data that may have been aggregated and stored by an external process.

20.8.3. Spatial Schemes

Spatial schemes lay out data based on a space-filling curve. All spatial schemes support the following common configuration option:

  • geom-attribute - The name of a Geometry-type attribute from the SimpleFeatureType to use for partitioning data. If not specified, the default geometry is used.

20.8.3.1. Z2 Scheme

Name: z2

Configuration:

  • z2-resolution - The number of bits of precision to use for z indexing. Must be a multiple of 2.

The Z2 scheme uses a Z2 space-filling curve, and can only be used with Point-type geometries. Instead of specifying the resolution as a configuration option, it may be specified in the name, as z2-<n>bits, where <n> is replaced with the Z2 resolution, e.g. z2-2bits.
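To give a feel for what the resolution means, the sketch below computes a generic Z-order (Morton) cell for a point: half of the bits go to longitude, half to latitude, and the bits are interleaved. This illustrates the general technique only. The bit ordering (longitude in the even bits), the boundary handling, and all names here are assumptions made for the example, and the actual implementation may differ.

```java
public class Z2PartitionSketch {

    // Maps a value in [min, max] to a cell index in [0, 2^bits - 1].
    static int normalize(double value, double min, double max, int bits) {
        int cells = 1 << bits;
        int cell = (int) Math.floor((value - min) / (max - min) * cells);
        // Clamp so that value == max falls into the last cell.
        return Math.max(0, Math.min(cells - 1, cell));
    }

    // Interleaves the longitude and latitude cell indices into one Z value.
    // `resolution` is the total number of bits and must be a multiple of 2.
    static long z2Cell(double lon, double lat, int resolution) {
        if (resolution <= 0 || resolution % 2 != 0 || resolution > 60) {
            throw new IllegalArgumentException(
                "Resolution must be a positive multiple of 2, at most 60: " + resolution);
        }
        int bitsPerDim = resolution / 2;
        int x = normalize(lon, -180.0, 180.0, bitsPerDim);
        int y = normalize(lat, -90.0, 90.0, bitsPerDim);
        long z = 0L;
        for (int i = 0; i < bitsPerDim; i++) {
            z |= ((long) ((x >> i) & 1)) << (2 * i);     // longitude: even bits
            z |= ((long) ((y >> i) & 1)) << (2 * i + 1); // latitude: odd bits
        }
        return z;
    }

    public static void main(String[] args) {
        // With 2 bits, the world is split into four quadrants, numbered 0 to 3.
        System.out.println(z2Cell(-75.0, 40.0, 2)); // 2
        System.out.println(z2Cell(10.0, 10.0, 2));  // 3
    }
}
```

Under these assumptions, a 2-bit resolution yields four partitions and each additional 2 bits splits every cell into four, so the resolution trades partition count against pruning precision.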

20.8.3.2. XZ2 Scheme

Name: xz2

Configuration:

  • xz2-resolution - The number of bits of precision to use for z indexing. Must be a multiple of 2.

The XZ2 scheme uses an XZ2 space-filling curve, and can be used with any geometry type. Instead of specifying the resolution as a configuration option, it may be specified in the name, as xz2-<n>bits, where <n> is replaced with the XZ2 resolution, e.g. xz2-2bits.
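The name-based form of the resolution (z2-<n>bits, xz2-<n>bits) is easy to recognize with a regular expression. The following sketch extracts the resolution from such a name and checks that it is a multiple of 2. The class, pattern, and method are hypothetical and show only one way such names could be parsed.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpatialSchemeNameSketch {

    // Matches "z2", "xz2", "z2-<n>bits" and "xz2-<n>bits".
    private static final Pattern NAME = Pattern.compile("^(x?z2)(?:-(\\d+)bits)?$");

    // Returns the resolution embedded in the scheme name, or empty if the
    // name carries no resolution (or is not a spatial scheme name at all).
    static OptionalInt resolutionFromName(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches() || m.group(2) == null) {
            return OptionalInt.empty();
        }
        int bits = Integer.parseInt(m.group(2));
        if (bits <= 0 || bits % 2 != 0) {
            throw new IllegalArgumentException(
                "Resolution must be a positive multiple of 2: " + bits);
        }
        return OptionalInt.of(bits);
    }

    public static void main(String[] args) {
        System.out.println(resolutionFromName("z2-2bits"));   // OptionalInt[2]
        System.out.println(resolutionFromName("xz2-10bits")); // OptionalInt[10]
        System.out.println(resolutionFromName("xz2"));        // OptionalInt.empty
    }
}
```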

20.8.4. Attribute Schemes

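Attribute schemes lay out data based on a lexicoded attribute value, that is, the attribute value encoded as a string whose lexicographic order matches the natural order of the values.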

Name: attribute

Configuration:

  • partitioned-attribute - The name of an attribute from the SimpleFeatureType to use for partitioning data.