17.10.8. T-Drive Trajectory Data¶
This directory provides T-Drive GeoMesa ingest commands and converter configuration files.
T-Drive is a project of Microsoft Research Asia. The overall project collected GPS tracks from over 30,000 taxis in Beijing for three months. These data were used to demonstrate a more efficient routing system.
Microsoft makes a small subsample of its data available for download. This is about one third of the taxis for a period of a week.
Before proceeding please check out the data description and included terms of use.
17.10.8.1. Getting T-Drive data¶
The T-Drive data set can be downloaded using the provided
download-data.sh
script in $GEOMESA_ACCUMULO_HOME/bin/
as such
./download-data.sh tdrive
Alternatively, download the T-Drive datasets from
here download
one or all of the zip files, then unzip
into a convenient directory.
Each zip contains several hundred CSVs for a total of 7,952 files and nearly 15 million data points.
Each CSV is formatted as follows: > Taxi Identifier, Timestamp, Longitude, Latitude Example: > 102, 2008-02-02 13:34:27,116.30826,39.94702
17.10.8.2. Ingest Commands¶
Check that the tdrive
simple feature type is available on the GeoMesa
tools classpath. This is the default case.
$ geomesa-accumulo env | grep tdrive
If it is not, merge the contents of reference.conf
with
$GEOMESA_ACCUMULO_HOME/conf/application.conf
, or ensure that
reference.conf
is in $GEOMESA_ACCUMULO_HOME/conf/sfts/tdrive
.
Run the ingest. You may optionally point to a different accumulo
instance using -i
and -z
options. See geomesa-accumulo help ingest
for more detail.
$ geomesa-accumulo ingest -u USERNAME -c CATALOGNAME -s tdrive -C tdrive tdrive_data.txt
Any errors during the ingest will be logged to $GEOMESA_ACCUMULO_HOME/logs
.