4. Getting Started
The first step in getting started with GeoMesa is to choose a persistent storage solution. This may be dictated by your target environment, but if not, there are several options available.
If you want a near real-time view of streaming data, then consider using Kafka or Redis.
Otherwise, you can get similar functionality through HBase, Accumulo, Cassandra, Google Bigtable, or Apache Kudu. HBase and Accumulo support distributed processing, so they may be faster for certain operations. HBase and Cassandra are the most widely used technologies, while Accumulo is often chosen for its advanced security features.
Another option is the FileSystem data store, which has a very low barrier to entry and can read existing data in a variety of file formats. The FileSystem data store can provide extremely low-cost storage when backed by cloud-native object stores; however, it is generally not as performant as an actual database.
For advanced use cases, multiple stores can be combined through Combined Data Store Views to provide both high performance (for recent data) and low cost (for older data).
Whichever storage solution you choose, the GeoMesa API is the same (outside of some back-end-specific configuration options). For most users, the back-end can be swapped out with minimal code changes.
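As a rough illustration of what swapping the back-end can look like, the sketch below builds connection parameters for two different stores. The class `BackendParams`, its method names, and the sample values are hypothetical. The parameter keys (`hbase.catalog`, `hbase.zookeepers`, `kafka.brokers`) are assumed from the respective data store documentation and should be checked against the GeoMesa version you use, which may also require additional parameters. The GeoTools lookup is left as a comment so the sketch compiles with the JDK alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: only the parameter map differs between back-ends.
final class BackendParams {

    // Parameters for an HBase-backed store (key names are assumptions, see above).
    static Map<String, String> hbase(String catalog, String zookeepers) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("hbase.catalog", catalog);
        params.put("hbase.zookeepers", zookeepers);
        return params;
    }

    // Parameters for a Kafka-backed store (key name is an assumption, see above).
    static Map<String, String> kafka(String brokers) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("kafka.brokers", brokers);
        return params;
    }

    public static void main(String[] args) {
        // Pick a back-end from the first argument; default to HBase.
        boolean useKafka = args.length > 0 && args[0].equals("kafka");
        Map<String, String> params = useKafka
                ? kafka("localhost:9092")
                : hbase("example_catalog", "localhost:2181");

        // With GeoTools and the matching GeoMesa module on the classpath,
        // the remaining code would not depend on the chosen back-end:
        //   DataStore store = DataStoreFinder.getDataStore(params);
        //   SimpleFeatureSource source = store.getFeatureSource("example_type");
        System.out.println(params);
    }
}
```

Because the data store is looked up from a plain map, the back-end choice can live in configuration rather than in code, which is one way to keep the rest of an application unchanged when the store is swapped.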
4.1. Quick Starts
The GeoMesa quick start tutorials are the fastest and easiest way to get started with GeoMesa. They are a good stepping stone to the other tutorials, which present increasingly involved examples of how to use GeoMesa. The quick starts show how to write custom Java code to ingest and query data with GeoMesa, and how to visualize the resulting changes in GeoServer.
4.2. Docker Images
The Geodocker project provides Docker images that make it easy to stand up an Accumulo cluster with GeoMesa already configured. This guide describes how to bootstrap a cluster using Amazon Elastic MapReduce (EMR) and Docker in order to ingest and query sample GDELT data.
4.3. Data Ingestion
GeoMesa provides an ingestion framework that can be configured using JSON, which means that your data can be ingested without writing any code. This makes it quick and easy to get started with your custom data formats, and changes to a format can be handled on the fly by updating the configuration, without code changes.
4.4. GeoJSON
GeoMesa provides built-in integration with GeoJSON. Its GeoJSON API allows GeoJSON data to be indexed and queried without using the GeoTools API: all data and operations are pure JSON. The API also includes a REST endpoint for web integration.
4.5. Spark
GeoMesa provides spatial functionality on top of Spark and Spark SQL. To get started, see Data Analysis.